Chapter 11

Attack Identification and Prevention in Networked Systems

Abstract

An essential requirement for a large-scale networked system to work properly is its safety. In this chapter, we investigate some preliminary issues in prevention and estimation of malicious and organized inputs to a networked system, which include detectability and identifiability of attacks, appropriate sensor distributions, and so on. We establish connections with system observability and with system transmission zeros. It has been made clear that accurate system model is quite important in both risk managements and attack building. We also discuss some recent quantitative methods developed for analyzing risks of a networked system.

Keywords

controllability; least squares estimates; observability; SCADA; security; state estimation; transmission zero

11.1 Introduction

A large-scale networked system usually consists of a great amount of subsystems, and these subsystems are often spatially distributed and far from each other geometrically. This asks for timely data exchanges among subsystems, which can be supported by advanced communication technologies, including both wired and wireless channels. Examples include electric power systems, unmanned vehicle systems, water supply networks, and so on.

Although integrations of several subsystems are greatly expected to increase performances of the whole system, they also provide more opportunities of being affected by malicious operators. There are many real world examples illustrating these possibilities, and in some cases, significant damages have been caused to the physical processes targeted by the attackers [1]. One example is the advanced computer worm Stuxnet, which happened in 2010 and infected some industrial control systems, reportedly leading to damages of approximately 1,000 centrifuges at these plants, although the attack itself was rather naive from a control engineer's point of view [2]. Another example is the Maroochy water breach in 2000 [3]. In this incident, an attacker managed to hack into some controllers that activate and deactivate some valves, which caused flooding of the grounds of a hotel, a park, and a river, with about a million liters of sewage. In 2003, the SQL Slammer worm attacked the Davis–Besse nuclear plant [4]. The recent multiple power blackouts in Brazil are also believed to be caused by malicious attacks [5].

Differently from faults and disturbances in a control system, which happen naturally and occasionally, attacks may happen in a well-organized way, simultaneously or sequentially over several spatial places of a large-scale networked system, with the objective to significantly violate performances of the system, or even to destroy the system itself. In other words, although faults also affect system behaviors, simultaneous events are usually not considered to be colluding. On the contrary, simultaneous attacks are often cooperative. In addition, faults are usually constrained by some physical dynamics and do not have an intent or objective, but attacks are usually not straightforwardly restricted by the dynamics of the plant physical process and/or chemical process, usually have a malicious objective, and often perform in a secrete way.

System safety is in fact not a new topic [6,7]. Particularly, in a power system, a supervisory control and data acquisition system, which is usually abbreviated as SCADA, is well developed and has been widely adopted. This system is originally designed to prevent electricity theft and to monitor abrupt voltage changes. In this system, measurement data of plant static outputs are used to estimate plant states, and the estimates are normally used in an internal control area of a transmission system operator. A basic assumption adopted in SCADA is that the system is in a steady-state behavior within its computation interval, which is usually of tens of seconds. When the plant states have a fast change during some alert or emergency situations, its estimation accuracy deteriorates drastically. Moreover, new power generation sources, such as large-scale wind power penetration, smart transmission devices, and so on have not been taken into account appropriately. In addition, inaccurate estimations about the state of a directly connected subsystem may cause some misleading estimates about system security and lead to a wrong decision of security control. These factors ask investigations on attack preventions using dynamic input–output data.

On the other hand, with the development of communication technologies and computer technologies, it is extensively anticipated that a plant and even subsystems of a plant are connected by some public communication networks, such as the world wide web and so on, to reduce hardware costs and increase maintenance flexibilities. In such a situation, a communication network becomes also an essential part of a networked system, such as that played by a physical plant and a digital feedback controller. When a system is connected by communication channels, many other possibilities arise for the system to be attacked. For example, it becomes easier for a hacker to insert some destructive disturbances in a more secrete way into a signal being transmitted in a communication channel [79]. In particular, a cyber-physical system usually suffers from some specific vulnerabilities, which do not exist in classical control systems and for which appropriate detection and identification techniques are required. For instance, the reliance on communication networks and standard communication protocols to transmit measurements and control packets increases the possibility of intentional and worst-case attacks against physical plants and/or the controller. On the other hand, information security methods, such as authentication, access control, message integrity, and so on, may not be adequate for a satisfactory protection of these attacks. Indeed, these security methods are only some passive ways to prevent a signal being disturbed during its transmissions. They do not exploit compatibilities of the received signals with the underlying physical/chemical process of the plant and/or the control mechanism. These characteristics may make them ineffective against an insider attack targeting the physical dynamics and/or the controller.

Generally, the dynamics of a networked system with possible attacks can be described as

x ( k + 1 ) = f ( k , x ( k ) , u ( k ) , d ( k ) ) ,

Image (11.1)

y ( k ) = g ( k , x ( k ) , u ( k ) , d ( k ) ) ,

Image (11.2)

where x(k)RnImage, y(k)RmImage, u(k)RpImage, and d(k)RqImage respectively represent the plant state vector, output vector, input vector, and the vector consisting of possible attacks. Moreover, f(k,)Image and g(k,)Image are in general nonlinear functions that may be time varying. For a physical plant to work properly, there are usually some strict restrictions on its states. For example, in a water supply network, the levels of its reservoirs are in general restricted in some suitable ranges. The set consisting of these permissible plant states is called the plant safe region, whereas the complimentary set of the plant states is called the unsafe region. A plant is said to be safe if its state vector belongs to its safe region; otherwise, the plant is said to be unsafe. The objectives of the malicious attackers are often manipulating the plant state vector from its safe region to its unsafe region through adding an attack disturbance vector process d(k)Image into the system with the smallest efforts, as well as to avoid being detected. In other words, to manipulate the plant from being safe to being unsafe in a theft and economic way. The input signal u(k)Image is delivered from a controller through a communication network designed to make the plant work satisfactory. Due to the imperfectness of communication channels, this delivery may have some time delays and even failures and be corrupted by environment noises. Without detectors, the plant is usually impossible to distinguish the input vector u(k)Image from that carrying the attack vector d(k)Image. In other words, the attack vector d(k)Image is usually injected into the input vector u(k)Image possibly in a transmission process, and this injection combines these two vectors into a single vector that is input to the plant. Differently from external disturbances widely studied in control system analysis and synthesis, the attack vector d(k)Image is usually not random. More precisely, it might be designed with a clear objective using some measurements of the plant output vector y(k)Image.

When d(k)0Image, the networked system described by Eqs. (11.1) and (11.2) is said to be in its normal situation, and the associated trajectories of its state and output vectors are said to be in their nominal behavior. Otherwise, the system is said to be in an abnormal situation, and the corresponding trajectories of its state and output vectors are said to be in their abnormal behavior. The objective of attack estimation is to clarify whether or not the system is in its normal situation using measurements of its output vector, whereas the objective of attack identification is to estimate the amount and positions of attackers when the system is in an abnormal situation. Generally, these two problems are dealt with through investigating characteristics of the residues of some detectors.

Taking into account that for nonlinear dynamic systems, there still does not exist a general analysis and/or synthesis method, our attention is restricted to linear systems in this chapter, in which both the function f(k,)Image of Eq. (11.1) and the function g(k,)Image of Eq. (11.2) are linear. This significantly reduces mathematical difficulties in handling the attack estimation and prevention problems, keeping the associated results significant and relevant from an engineering viewpoint. With this simplification, the input vector u(k)Image can be removed from the model, as its existence does not affect conclusions on either attack identification or attack prevention due to the linearities of the system.

In this chapter, we first discuss attack identification and prevention using static data, which has been extensively adopted in large-scale electric power networks. Afterward, relations between system observability and attack prevention are investigated. With these relations, estimation and identification of attacks are respectively dealt with, as well as the relations between system security and optimal sensor placements. Further research topics will also be briefly discussed.

To clarify basic concepts and ideas, rather than the general model of a networked system given by Eqs. (3.25) and (3.26), which has been discussed in several chapters of this book, in this chapter the state-space model of Eq. (3.1) is utilized.

11.2 The SCADA System

Very possibly, the most successful attack estimator until now is the SCADA system, which is originally designed to protect an electricity network from working in an unsafe region. In this system, the plant is assumed to be in steady state, and its state vector is estimated from its output measurements using a least squares method. Clearly, to let this system work properly, it is necessary that the number of sensors in the plant is not smaller than the dimension of its state vector. Otherwise, the output of the associated attack detector cannot be uniquely determined, which will invalidate its usefulness. Generally, the more the sensors the plant has, the better the detection performance the system has.

When the plant is linear and works in its steady state, and only output measurements are utilized in estimating its state vector, the relations between the plant outputs and states, described by Eq. (11.2), can be expressed in a much simpler way given as follows:

y ( k ) = C k x ( k ) + D k d ( k ) + w ( k ) .

Image (11.3)

In this equation, a vector w(k)Image is introduced to represent measurement noises. For simplicity, we assume that its mathematical expectation is constantly equal to zero, whereas its covariance matrix is constantly equal to the identity matrix. As pointed out before, to get a physically meaningful estimate about the plant states, it is necessary that the dimension of the output vector y(k)Image is not smaller than that of the state vector x(k)Image. This means that the output matrix CkImage has more rows than columns.

When no attacks are available, a reasonable estimate about the plant state vector, denoted xˆ(k)Image, which is extensively adopted and widely called a least squares estimate, is

x ˆ ( k ) = ( C k T C k ) 1 C k T y ( k ) .

Image (11.4)

With this estimate, the predicted plant output yˆ(k)Image and its prediction errors y˜(k)Image are respectively

y ˆ ( k ) = C k ( C k T C k ) 1 C k T y ( k ) ,

Image (11.5)

y ˜ ( k ) = y ( k ) y ˆ ( k ) = [ I C k ( C k T C k ) 1 C k T ] y ( k ) .

Image (11.6)

Obviously, to make this estimate physically meaningful, it is necessary that the matrix CkImage is of full column rank, which guarantees the existence of the inverse of CkTCkImage. This condition can be satisfied through an appropriate sensor placement in the plant.

When the measurement error vector w(k)Image has a normal distribution, we can easily prove that y˜T(k)y˜(k)Image has a X2Image distribution. Hence, an anomaly detector, which is based on a quantity of the residue of plant output measurement, can be defined as

r ( k ) = S y ( k ) , S = I C k ( C k T C k ) 1 C k T ,

Image (11.7)

which is in fact the SCADA system [6]. Obviously, this detector shares the same form of the output prediction error in the normal situation that is given by y˜(k)Image. In this system, rT(k)r(k)Image is utilized to decide whether or not there exists an attack in the networked system. When its value is greater than a prescribed threshold, it is believed that an attack exists. Otherwise, there are no attacks in the networked system [6,7].

This anomaly detector is usually efficient in detecting a single attack which caused a significant change in one of the plant output measurements. However, when a coordinated malicious attack exists in the plant and may cause simultaneous change of several plant output measurements, very possibly, this detector does not work quite well. For example, assume that there is a coordinated attack d¯(k)Image satisfying

D k d ¯ ( k ) = C k x ¯ ( k ) .

Image (11.8)

This is possible if the attackers appropriately select their attack places and strategies and coordinate their actions. Under such a situation, we have that the output of the anomaly detector satisfies the following equalities:

r ( k ) = S y ( k ) = [ I C k ( C k T C k ) 1 C k T ] × [ C k x ( k ) + D k d ¯ ( k ) + w ( k ) ] = [ I C k ( C k T C k ) 1 C k T ] × [ C k x ( k ) + w ( k ) ] + [ I C k ( C k T C k ) 1 C k T ] C k x ¯ ( k ) = [ I C k ( C k T C k ) 1 C k T ] × [ C k x ( k ) + w ( k ) ] ,

Image (11.9)

that is, no matter how large the magnitude of an element in the vector x¯(k)Image is, the output of the anomaly detector, that is, r(k)Image, does not change. This invalidates its capabilities of detecting the attacks satisfying Eq. (11.8).

Note that when the plant is linear and the input signal u(t)Image is constantly set to zero, by Eq. (11.1) its state evolutions can be expressed as

x ( k + 1 ) = A k x ( k ) + B k d ( k ) .

Image (11.10)

In the plant steady state, we have that x(k+1)=x(k)Image. Hence,

x ( k ) = [ I A k ] 1 B k d ( k ) ,

Image (11.11)

which may be very far from the steady state of the plant without an attack, which is equal to zero by the adopted assumption that u(k)0Image. In fact, when the attack of Eq. (11.8) is applied,

x ( k ) = [ I A k ] 1 B k D k C k x ¯ ( k ) ,

Image (11.12)

where DkImage stands for the pseudo-inverse of the matrix D(k)Image.1 As each element of the vector x¯(k)Image can be arbitrarily large in magnitude without being detected by the anomaly detector, this equation implies that an appropriate selection of either the matrix BkImage or the matrix DkImage, or both of them, may cause a significant change of the plant states, which are capable of deriving the plant to its unsafe region. On the other hand, through choices of attack positions in a networked system, a selection of either the matrix BkImage or the matrix DkImage can be realized in a relatively easy way, provided that the attackers have accurate information on the system model, which is mostly about the system output matrix CkImage in this case.

These conclusions are valid even if the plant is not in its steady state. In addition, numerical studies show that the attack in the form (11.8), which is extensively called stealthy attack, is usually sparse [6,7].

To analyze the security of a networked system, some security indices have also been introduced. When the plant is also time invariant, the following index αjImage is suggested in [7] to measure the security degree of the plant jth output measurement:

α j = def min x R n C x 0 subject to  C ( j , : ) x 0 ,

Image (11.13)

where to make the expressions more concise, the temporal variable k has been omitted from the system matrices. Moreover, C(j,:)Image stands for the jth row vector of the output matrix C.

From its definition it is clear that when an attacker is intended to change the plant jth output measurement without being detected by the SCADA system, αjImage is in fact the minimum number of plant output measurements that he/she needs compromise. The smaller this index, the easier the jth plant output measurement to be attacked by a stealthy attack, that is, an attack that cannot be detected in principle. This means that knowledge about this security index for all the plant output measurements may provide some useful information about the security vulnerabilities of the networked system, which may be helpful in protecting the networked system with restricted resources.

Although the security index is significant from an engineering viewpoint, the associated optimization problem itself cannot be easily solved. In fact, this minimization problem has been proven to be NP-hard, and various modifications have been suggested as an alternative. One example is to replace the vector 0-norm with the vector 1-norm.

11.3 Attack Prevention and System Transmission Zeros

The previous section illustrates briefly the SCADA system, which uses only plant output measurements to detect whether or not there are attacks in the plant. The attack detection method is quite simple, but it reveals almost all essential issues in attack preventions: an attacker usually intends to significantly violate or even destroy a system in a secrete way, whereas a detector should be efficient in revealing all the attacks as soon as possible. This implies that there may exist close relations between attack preventions and system observability. In this section, we clarify these relations under the situation in which both the detector and the attacker have accurate knowledge about the plant dynamics.

As general conclusions are still not available for nonlinear time-varying systems, our attention is also restricted to linear and time-invariant systems in this section. We assume that an attacker is intended to alter the states of a plant through injecting some exogenous inputs on the basis of the system matrices of the plant at each time. In addition, the attacker is assumed to have unrestricted computation capabilities. An attacker with these characteristics is called an omniscient attacker. Under these assumptions, stealthy colluding attacks are closely related to plant transmission zeros.

Assume that the dynamics of a networked system with possible attacks is described as

x ( k + 1 ) = A x ( k ) + B d ( k ) ,

Image (11.14)

y ( k ) = C x ( k ) + D d ( k ) .

Image (11.15)

As in the previous section, here d(k)Image represents an attack disturbance. Once again, we assume that x(k)RnImage, y(k)RmImage, and d(k)RqImage. Noting that the plant is assumed to be linear, it is not necessary to include control signals in the adopted model.

To simplify discussions, we assume throughout the rest of this chapter that the above state space model is a minimal realization, that is, it is both controllable and observable. Note that if the system is not controllable, then a decomposition can divide its state space into the direct sum of controllable and uncontrollable subspaces, and the system state vector can be manipulated by an attacker only when it is in the controllable subspace. On the other hand, if the system is unobservable, then the system state space can be decomposed into the direct sum of observable and unobservable subspaces. As the part of a state vector belonging to the unobservable subspace does not have any influence on the plant output vector, variations of this part cannot be detected by the plant output vectors. That is, under such a situation, there always exist attacks that cannot be detected and therefore cannot be identified by monitoring only these plant outputs. This observation means that the aforementioned minimal realization assumption is reasonable.

Another assumption adopted in the rest of this chapter is that the matrix [BD]Image is of full column rank, which is constituted from the system input matrix B and its direct feedthrough matrix D. This is necessary for the identifiability of attacks, noting that if this matrix is not of full column rank, then different attacks may cause the same value of the vector [BD]d(k)Image. More precisely, for an arbitrary d¯(k)Null([BD])Image, it is obvious from the definition of the null space of a matrix that

[ B D ] [ d ( k ) + d ¯ ( k ) ] = [ B D ] d ( k ) .

Image

In addition, when the matrix [BD]Image is not of full column rank, Null([BD])Image has at least one nonzero vector as its element. Under such a situation, it is clear from Eqs. (11.14) and (11.15) that these different attacks lead to the same trajectory of the system state vector and to the same trajectory of the system output vector. Obviously, this is not an appreciative property in attack detection and/or attack identification.

Similarly to those in the SCADA system, an attack detector, which is sometimes also called an attack monitor, exploits only plant dynamics and output measurements to reveal the existence of attacks and identify their positions. Hence, an attack cannot be detected in principle if the associated plant output measurements are consistent with those without any attacks. On the contrary, if an attack causes a plant output measurement that has some characteristics different from those in regular/normal situations, some methods may be developed to detect this attack. Note that the output of a plant depends on both its initial conditions and external inputs. To clarify this dependence, rather than y(k)Image, y(k,x(0),d(j)|j=0k)Image is used more often in the following discussions. Here, x(0)Image belongs to the set RnImage and represents the value of the plant state vector at the time instant k=0Image.

Similarly, x(k,x(0),d(j)|j=0k)Image is sometimes adopted to indicate the dependence of the state vector on its initial values and on the input vector sequence d(k)|k=0Image.

Definition 11.1

(Undetectable attack) Concerning the system described by Eqs. (11.14) and (11.15), an attack d(k)|k=0Image is undetectable if for each initial state vector x(0)RnImage, there exists at least one other initial state vector x¯(0)RnImage such that y(k,x¯(0),d(j)|j=0k)=y(k,x(0),0)Image at each k=0,1,2,Image.

From an engineering point of view, this definition means that a detector cannot be constructed if the plant outputs associated with some attacks are completely the same as those due to modifications of plant initial conditions. One of the basic motivations behind this definition is that plant initial conditions are usually not exactly known in many applications, which makes the mapping from a plant input series to a plant output series not bijective and therefore leaves opportunities for an attacker to inject destructive disturbances.

In addition, to detect whether or not there exists an attack in a networked system, it is usually necessary for a detector to determine where the attack is from if it exists. This leads to an attack identification problem and requires to construct a detector that has capabilities of identifying attack locations. As attacks in a networked system are usually sparse, only a few elements of the attack vector d(k)Image are usually not constantly equal to zero. Hence, some upper bounds can be put on the number of colluding attackers. Here, we assume that at most K attackers collude in one attack.

Definition 11.2

(Unidentifiable attack) Concerning the system described by Eqs. (11.14) and (11.15), an attack vector sequence d(k)|k=0Image with K elements not constantly zero is unidentifiable if for each initial state vector x(0)RnImage, there exist at least one other initial state vector x¯(0)RnImage and one other attack vector sequence d¯(k)|k=0Image with K¯Image elements not constantly zero, where 0K¯KImage, such that at every time instant k=0,1,2,Image, the plant output satisfies y(k,x¯(0),d¯(j)|j=0k)=y(k,x(0),d(j)|j=0k)Image.

Similarly to an undetectable attack, an unidentifiable attack cannot be identified because another attack can generate completely the same plant output, which makes plant outputs alone not sufficiently informative in distinguishing these attacks.

Let KImage denote a set consisting of K nonzero positive integers that belong to the set {1,2,,q}Image, where q stands for the dimension of the attack vector d(k)Image. Associating with this set, define the attack set AKImage in which an entry of the attack vector d(k)Image is not constantly set to zero only if it is in the row whose number is one of the elements of the set KImage. More precisely, assume that K={α1,α2,,αK}Image with αi{1,2,,q}Image for each i=1,2,,qImage. Then,

A K = { d ( k ) | d ( i , k ) 0 , i { 1 , 2 , , q } K ; d ( i , k ) 0 , i K } ,

Image

where d(i,k)Image stands for the ith row element of the attack vector d(k)Image.

An attack set is undetectable if there exists at least one undetectable attack in that set. Similarly, an attack set is unidentifiable if there exists at least one unidentifiable attack in that set. As the attack set AKImage only provides information about the position, in which an attack is injected into the networked system, which is completely the same as that of the set KImage, sometimes the set KImage is also called as an attack set. In other words, when an attack from the attack set AKImage is injected into the networked system described by Eqs. (11.14) and (11.15), Bd(k)Image and Dd(k)Image in these equations can be respectively rewritten as BKdK(k)Image and DKdK(k)Image, where BKImage is constituted from the columns of the matrix B whose numbers are consistent with the elements of the set KImage, whereas DKImage is constituted from the columns of the matrix D whose numbers are consistent with the elements of the set KImage. In addition, dK(k)Image is a K-dimensional real-valued vector.

In [9], a different attack detection problem is formulated and investigated. It is assumed there that only sensors or actuators of a networked system are attacked, and a characterization is given for the maximum number of attacks that can be detected and corrected, which is expressed as a function of the plant state transition matrix and the plant output matrix. Although this problem formulation appears to be different from that given in Definition 11.1, they are actually closely related to each other [7].

From linearity assumption of the networked system and Definition 11.1 it is obvious that an attack d(k)|k=0Image is undetectable if and only if there exists a plant initial state vector x˜(0)RnImage such that

y ( k , x ˜ ( 0 ) , d ( j ) | j = 0 k ) 0 .

Image (11.16)

The latter is closely related to some fundamental properties of a dynamic system. In fact, Eq. (11.16) has completely the same form as that of the conditions adopted in the definition of the zero dynamics of a system. More precisely, under some weak conditions, the equality in this equation can be satisfied if and only if the attack d(k)|k=0Image only stimulates the zero dynamics of the networked system [10,11]. Due to these relations, we can establish an algebraic criterion to verify whether or not an attack set is undetectable.

Similarly, we can straightforwardly prove that an attack d(k)|k=0Image with K elements that are not constantly zero is unidentifiable if and only if there exist a plant initial state vector x˜(0)RnImage and an attack d¯(k)|k=0Image with K¯Image elements in d(k)|k=0Image that is not constantly zero and the integer K¯Image satisfying 0K¯KImage such that

y ( k , x ˜ ( 0 ) , d ( j ) d ¯ ( j ) | j = 0 k ) 0 .

Image (11.17)

Note that Eqs. (11.16) and (11.17) are quite similar in their forms. It is not hard to understand that attack identification is once again closely related to transmission zeros of a networked system, which leads to an easily verifiable algebraic condition for an unidentifiable attack set.

11.3.1 Zero Dynamics and Transmission Zeros

These discussions reveal that both attack detection and attack identification are in fact a verification problem on the existence of an initial state vector and an input sequence that make the plant output vector be constantly equal to zero, which is closely related to zero dynamics and therefore to transmission zeros of a system. To illustrate these relations, the following conclusions are first established for a discrete-time system, which are similar to the results about continuous-time systems given in [10,11].

Lemma 11.1

Let G(z)Image be the transfer function matrix with minimal realization given by Eqs. (11.14) and (11.15). If the system input vector process d(k)Image is of the form d(k)=λkd(0)Image for each k0Image and the system initial conditions satisfy x(0)=(λIA)1Bd(0)Image, in which λ is a constant scalar not equal to any eigenvalue of the matrix A, and d(0)Image is a constant vector with compatible dimension. Then for an arbitrary integer k0Image, the system output vector process y(k)Image can be expressed as

y ( k ) = λ k G ( λ ) d ( 0 ) .

Image (11.18)

Proof

When λ is not an eigenvalue of the matrix A, the matrix λIAImage is invertible, which means that the condition x(0)=(λIA)1Bd(0)Image is well defined. Note that when a system has a minimal realization of Eqs. (11.14) and (11.15), the ZImage-transformation of its output vector process y(k)Image can be expressed as

y ( z ) = C ( z I A ) 1 [ x ( 0 ) + B d ( z ) ] + D d ( z ) .

Image (11.19)

Hence, when d(k)=λkd(0)Image for each k0Image, we have that

y ( z ) = C ( z I A ) 1 { x ( 0 ) + B [ ( z λ ) 1 d ( 0 ) ] } + D [ ( z λ ) 1 d ( 0 ) ] = [ C ( λ I A ) 1 B + D ] [ ( z λ ) 1 d ( 0 ) ] + C ( z I A ) 1 { x ( 0 ) + B [ ( z λ ) 1 d ( 0 ) ] } C ( λ I A ) 1 B [ ( z λ ) 1 d ( 0 ) ] = G ( λ ) [ ( z λ ) 1 d ( 0 ) ] + C ( z I A ) 1 { x ( 0 ) + [ I ( z I A ) ( λ I A ) 1 ] B [ ( z λ ) 1 d ( 0 ) ] } = G ( λ ) [ ( z λ ) 1 d ( 0 ) ] + C ( z I A ) 1 { x ( 0 ) ( λ I A ) 1 B d ( 0 ) } .

Image (11.20)

Therefore, the condition x(0)=(λIA)1Bd(0)Image leads to the following equality:

y ( z ) = G ( λ ) [ ( z λ ) 1 d ( 0 ) ]

Image (11.21)

Recall that λ is a constant scalar. The proof can now be completed through taking the inverse ZImage-transform of the vector-valued function y(z)Image. □

From Lemma 11.1 it is clear that if d(0)Image belongs to the null space of G(λ)Image, then when the system initial state vector is set as x(0)=(λIA)1Bd(0)Image, the plant output vector y(k)Image is constantly equal to zero for each k=0,1,2,Image. However, it is worth mentioning that when λ is a real scalar and d(0)Image is a real-valued vector, both λkd(0)Image with any k0Image and (λIA)1Bd(0)Image are real-valued vectors, which can be realized in principle in actual engineering problems, noting that both the state transition matrix A and the system input matrix B are real valued. However, when either λ is a complex scalar or d(0)Image is a complex-valued vector, the associated λkd(0)Image with k0Image and/or (λIA)1Bd(0)Image may also be complex valued. This may make the associated values not be realizable by any actual system input vector process and/or any system initial states.

When λ is equal to an eigenvalue of the state transition matrix A, the discussions become more complicated. However, with the results of Theorem 2.5, similar results can still be obtained. On the other hand, using a concept called system matrix, similar results can also be established without distinguishing whether or not λ is an eigenvalue of the state transition matrix A, which are given in the following Lemma 11.2.

To guarantee the existence of a nontrivial input vector process d(k)Image and a nontrivial initial state vector x(0)Image such that all the conditions of Lemma 11.1 are satisfied and the plant output vector process is constantly equal to zero, it is necessary that the matrix G(λ)Image is not of full column rank. When the system is of single input and single output, it is obvious that λ must be one of the zeros of its transfer function, as the minimal realization assumption does not permit the existence of a zero in the transfer function that is equal to one of its poles. The situation becomes much more complicated when the system is of multiple inputs and multiple outputs. In this case, even if its state space model is a minimal realization, there still exists possibility that some of its zeros and some of its poles share the same value. To clarify the mathematical descriptions of a system zero and its relations to the zero dynamics of the system, the following concepts are required.

Definition 11.3

A vector x(0)RnImage is called weakly unobservable if there exists an input vector sequence d(k)|k=0Image, such that the corresponding output vector sequence y(k)Image is constantly equal to zero, that is, y(k,x(0),d(j)|j=0k1)=0Image for each k=0,1,2,Image.

Denote by VImage the set consisting of all weakly unobservable vectors of a system. From the linearity of the system we can straightforwardly proven that this set is actually a subspace of RnImage. Due to this reason, the set VImage is sometimes called the weakly unobservable subspace of the system [11,12]. In fact, if we define a subspace VkImage with k=0,1,2,Image, as

V k = { x ( 0 ) | x ( 0 ) R n , there is an input sequence d ( k ) | k = 0 , such that y ( i ) = 0 for each i = 0 , 1 , , k 1 } ,

Image

then it is obvious from the definition of this subspace sequence that

V 0 V 1 V 2 V 3 .

Image

In addition, we can directly prove that there exists a nonnegative integer 0knImage such that

V = V k = V k + i for all i 0 .

Image

Let FRq×nImage and LRq×sImage satisfy

( A + B F ) V V , N u l l ( C + D F ) V , S p a n ( L ) = N u l l ( D ) ( B V ) .

Image

The existence of these matrices is shown in [11]. Then we can prove that, for each x(0)VImage, an input vector sequence d(k)|k=0Image that satisfies

y ( k , x ( 0 ) , d ( j ) | j = 0 k 1 ) = 0 for each k = 0 , 1 , 2 , ,

Image

can be parameterized so that, for an arbitrary k{0,1,2,}Image,

d ( k ) = F x ( k ) + L w ( k ) ,

Image

where w(k)Image is an arbitrary real vector sequence with compatible dimension. Moreover, we can also prove that if x(0)VImage and d(i)|i=0Image with d(i)RqImage is an associated input vector sequence such that y(k,x(0),d(i)|i=0k1)=0Image for each k=1,2,Image, then the associated state vector sequence x(k)|k=0Image, denoted x(k,x(0),d(i)|i=0k1)|k=0Image, satisfies x(k,x(0),d(i)|i=0k1)VImage for each k=1,2,Image.

Some other properties of this weakly unobservable subspace can be found, for example, in [11,12].

The concept of zero dynamics of a system is established on the basis of its weakly unobservable subspace.

Definition 11.4

When the initial state vector of the system described by Eqs. (11.14) and (11.15) is restricted to belong to its weakly unobservable subspace VImage, its associated input–output relation is called its zero dynamics.

The zero dynamics of a system is also connected to its weakly unobservable subspace through its system matrix, a concept suggested originally by Rosenbrock [11,13].

Lemma 11.2

Concerning the system described by Eqs. (11.14) and (11.15), assume that the matrix [BD]Image is of full column rank. Define its system matrix

P ( z ) = [ z I A B C D ] .

Image (11.22)

Suppose that there exist a vector x(0)RnImage and a vector d(0)RqImage such that at least one of these two vectors is not equal to zero, and at a real value of the complex variable z, denoted λ, the following equality is satisfied:

P ( λ ) [ x ( 0 ) d ( 0 ) ] = 0 .

Image (11.23)

Then for each k=1,2,Image, the input vector sequence d(k)=λkd(0)Image satisfies both y(k,x(0),d(i)|i=0k)=0Image and d(k)RqImage.

Proof

Note that the system defined by Eqs. (11.14) and (11.15) is well defined. This means that for a fixed initial state vector x(0)Image and each input vector sequence d(k)|k=0Image, there are only one state vector sequence x(k)|k=0Image and only one output vector sequence y(k)|k=0Image that satisfy these two equations simultaneously.

When condition (11.23) is satisfied, define x(k)Image as x(k)=λkx(0)Image for every k=0,1,2,Image. Then, we can straightforwardly prove that

x ( 1 ) = λ x ( 0 ) = A x ( 0 ) + B d ( 0 ) .

Image

Assume now that, for an integer i with i0Image, Eq. (11.14) is satisfied by x(i)=λix(0)Image and d(i)=λid(0)Image. Then, we can straightforwardly prove that

x ( i + 2 ) = λ i + 2 x ( 0 ) = λ x ( i + 1 ) = λ [ A x ( i ) + B d ( i ) ] = A [ λ x ( i ) ] + B [ λ d ( i ) ] = A x ( i + 1 ) + B d ( i + 1 ) ,

Image

that is, when k=i+1Image, the aforementioned state vector process and input vector process also satisfy Eq. (11.14). Therefore, x(k)=λkx(0)Image and d(k)=λkd(0)Image satisfy this equation for every k0Image. Hence, x(k)=λkx(0)Image is the solution to this difference equation with its initial state vector x(0)Image and input vector process d(k)=λkd(0)Image.

On the other hand, note that, for each k=0,1,2,Image,

C x ( k ) + D d ( k ) = C ( λ k x ( 0 ) ) + D ( λ k d ( 0 ) ) = λ k [ C x ( 0 ) + D d ( 0 ) ] = 0 ,

Image

provided that the vectors x(0)Image and d(0)Image satisfy conditions (11.23). Therefore, y(0)=0Image, and for each k=1,2,Image,

y ( k , x ( 0 ) , d ( i ) | i = 0 k 1 ) = 0 .

Image

Now, assume that λ takes a complex value. Denote its real and imaginary parts respectively by λrImage and λjImage. Recall that the matrices A, B, C, and D and the vectors x(0)Image and d(0)Image are real valued. Satisfaction of Eq. (11.23) implies that

[ λ r I A C ] x ( 0 ) + [ B D ] d ( 0 ) = 0 , λ j x ( 0 ) = 0 .

Image

In addition, note that

[ B D ] = [ I 0 0 I ] [ B D ] ,

Image

which means that

rank ( [ B D ] ) = rank ( [ B D ] ) .

Image

If λj0Image, then it is necessary that x(0)=0Image. As the matrix [BD]Image is assumed to be of full column rank, x(0)=0Image and the first subequation in the above equation imply that the vector d(0)Image is also equal to zero. This is a contradiction with the requirement that at least one of the vectors x(0)Image and d(0)Image is not equal to zero. Hence, λj=0Image, that is, λ takes a real value.

These arguments show that if there is λ satisfying Eq. (11.23), then it is necessary that this λ is real valued. Hence, λkd(0)Image is real valued for each k=0,1,2,Image. This completes the proof. □

Clearly, a real vector satisfying Eq. (11.23) belongs to the weakly unobservable subspace VImage of the system.

From the proof of Lemma 11.2 it is obvious that if both the system initial state vector x(0)Image and its input vector sequence d(k)|k=0Image are allowed to take complex values, then the associated results are valid even when the complex variable z takes a complex value that satisfies Eq. (11.23). In this case, both x(0)Image and d(0)Image may be complex vectors, which will lead to a complex-valued sequence d(k)|k=0Image that makes the associated system output vector sequence constantly equal to zero.

On the other hand, Lemma 11.2 does not require that λ is different from every eigenvalue of the state transition matrix A, which makes its results more general than those on system zero dynamics derived from Lemma 11.1. In fact, if λ is not an eigenvalue of the state transition matrix A, then Eq. (11.23) straightforwardly leads to [C(λIA)1B+D]d(0)=0Image, which can be rewritten as G(λ)d(0)=0Image. This is completely in the same form as the condition obtained from Lemma 11.1 for the system output vector sequence constantly equal to zero. These observations imply that when λ is not an eigenvalue of the matrix A, the conditions of Lemma 11.2 are equal to those derived from Lemma 11.1. Hence, we can regard that Lemma 11.2 includes some conclusions of Lemma 11.1 as its particular case.

To guarantee the existence of a nonzero vector x(0)Image and/or a nonzero vector d(0)Image such that Eq. (11.23) is satisfied, it is necessary that the matrix P(λ)Image is not of full column rank. A complex value λ at which the system matrix P(z)Image defined by Eq. (11.22) is not of full column rank is called a transmission zero of this system, which is an extension of a zero in a transfer function of a single-input single-output system to a zero of the transfer function matrix of a multiple-input multiple-output system [10,13].

Various methods have been developed for the computation of the transmission zeros of a multiple-input multiple-output system, among which one of most widely adopted methods appears to be the approach based on the so-called McMillan–Smith form [10,11]. In addition, the proof of the lemma reveals that to construct an actually realizable system initial state vector and an actually realizable system input vector process such that the system output vector process is constantly equal to zero, real-valued transmission zeros must be considered.

Our discussions show that if a system has a real transmission zero, then there certainly exist a real-valued initial state vector and a real-valued input vector sequence that lead to a constantly zero output vector sequence. However, it is still not clear that if there is a real-valued initial state vector x(0)Image and a real-valued input vector sequence d(k)|k=0Image such that the associated output vector sequence y(k,x(0),d(j)|j=0k)Image is constantly zero, whether or not the system must have at least one real transmission zero.

To attack this problem, we need the following concept.

Definition 11.5

If for all permissible input vector sequences u1(k)|k=0Image and u2(k)|k=0Image, the satisfaction by the system output vector sequence y(k)|k=0Image of the condition

y ( k , 0 , u 1 ( j ) | j = 0 k ) = y ( k , 0 , u 2 ( j ) | j = 0 k )

Image

for all k=0,1,2,Image implies that

u 1 ( k ) = u 2 ( k )

Image

for all k=0,1,2,Image, then this system is called left invertible.

When a system is linear, it is obvious from the definition that a necessary and sufficient condition for its left invertibility is that y(k,0,u(j)|j=0k)0Image if and only if u(k)0Image. For the linear time-invariant system described by Eqs. (11.14) and (11.15) that has a minimal realization, let G(z)Image denote its transfer function matrix C(zIA)1B+DImage. It has been proven that the following statements are equivalent to each other [11]:

  • •  This system is left invertible.
  • •  There exists a rational matrix-valued function Gl(z)Image such that Gl(z)G(z)=IImage, that is, the transfer function matrix G(z)Image has a left inverse.
  • •  The system matrix P(z)Image defined in Eq. (11.22) has a rank of n+qImage for all but finitely many zCImage.
  • •  Let VImage denote the weakly unobservable subspace of the system. Then

V { B × N u l l ( D ) } = { 0 } and [ B D ] is   of   full   column   rank.

Image

Related to the concept of weak observability, there is also a concept called strong observability.

Definition 11.6

If for every permissible initial state vector x(0)Image and every feasible input vector sequence u(k)|k=0Image, the system output vector sequence y(k)|k=0Image satisfies the equality

y ( k , 0 , u ( j ) | j = 0 k ) = 0

Image

for all k=0,1,2,Image only when

x ( 0 ) = 0 ,

Image

then this system is called strongly observable.

For a linear time invariant system, the following results have been established in [11].

Lemma 11.3

A system with its state space model described by Eqs. (11.14) and (11.15) is strongly observable, if and only if at every complex value of the complex variable z,

rank ( [ z I A B C D ] ) = n + rank ( [ B D ] ) .

Image

From Definitions 11.3 and 11.6 it is obvious that a system is strongly observable if and only if its weakly unobservable subspace is constituted only from the zero vector. It has also been proven that, for the linear time-invariant system of Eqs. (11.14) and (11.15), its strong observability is equivalent to that, for each real matrix F with compatible dimension, the matrix pair (A+BF,C+DF)Image is always observable [11].

Based on these results, the following conclusions are derived, which reveal situations under which the weakly unobservable subspace of a system is not restricted only to the zero vector.

Theorem 11.1

Assume that the system described by Eqs. (11.14) and (11.15) is left invertible. Then there exists at least one nonzero initial state vector x(0)Image such that there is an input vector sequence d(k)|k=0Image that makes the system output vector y(k,x(0),d(j)|j=0k)Image constantly equal zero if and only if it has a transmission zero.

Proof

Assume that the system has a real transmission zero. Then by Lemma 11.2 there certainly exists a nonzero initial state vector satisfying the requirements.

Now, assume that all the transmission zeros of the system are complex. Let λ be any one of them. Then there exist vectors x(0)Image and d(0)Image such that at least one of them is not equal to zero and the following equality is satisfied:

[ λ I A B C D ] [ x ( 0 ) d ( 0 ) ] = 0 .

Image (11.24)

Recall that all the system parameter matrices A, B, C, and D are real. Taking the conjugates of both sides of this equation, we obtain the following equality:

[ λ I A B C D ] [ x ( 0 ) d ( 0 ) ] = [ λ ¯ I A B C D ] [ x ¯ ( 0 ) d ¯ ( 0 ) ] = 0 .

Image (11.25)

As col{x(0),d(0)}Image does not equal zero, it is obvious that the vector col{x¯(0),d¯(0)}Image is not a zero vector either. These imply that λ¯0Image is also a transmission zero of the system.

Define d(k)Image as d(k)=λkd(0)Image, k=0,1,2,Image. Let d(z)Image and d(z)Image denote respectively the ZImage-transformations of the sequences d(k)|k=0)Image and d¯(k)|k=0)Image. Moreover, let y(z)Image and y(z)Image denote respectively the ZImage-transformations of the output vector sequences y(k,x(0),d(j)|j=0k)Image and y(k,x¯(0),d¯(j)|j=0k)Image. According to the proof of Lemma 11.2, we have that

y ( k , x ( 0 ) , d ( j ) | j = 0 k 1 ) = 0 , y ( k , x ¯ ( 0 ) , d ¯ ( j ) | j = 0 k 1 ) = 0 for all k = 0 , 1 , 2 , .

Image (11.26)

Define x(0)Image and d(k)Image further as

x ( 0 ) = x ( 0 ) + x ¯ ( 0 ) , d ( k ) = d ( k ) + d ¯ ( k ) for all k = 0 , 1 , 2 , .

Image

Then, both x(0)Image and d(k)Image with k{0,1,2,}Image are real valued and therefore can be realized in principle.

Let y(z)Image denote the ZImage-transformation of the output vector sequence associated with the system initial state vector x(0)Image and the input vector sequence d(k)|k=0Image. Then by Eq. (11.19) we have that

y ( z ) = C ( z I A ) 1 [ x ( 0 ) + B d ( z ) ] + D d ( z ) = C ( z I A ) 1 { [ x ( 0 ) + x ¯ ( 0 ) ] + B [ d ( z ) + d ( z ) ] } + D [ d ( z ) + d ( z ) ] = { C ( z I A ) 1 [ x ( 0 ) + B d ( z ) ] + D d ( z ) } + { C ( z I A ) 1 [ x ( 0 ) + B d ( z ) ] + D d ( z ) } = y ( z ) + y ( z ) .

Image (11.27)

We can therefore declare from Eq. (11.26) that, for each k=0,1,2,Image,

y ( k , x ( 0 ) , d ( j ) | j = 0 k ) = 0 .

Image (11.28)

On the contrary, assume that the system described by Eqs. (11.14) and (11.15) does not have a transmission zero but there is a nonzero initial state vector x(0)Image such that there exists an input sequence d(k)|k=0Image that makes y(k,x(0),d(j)|j=0k)=0Image for every k=0,1,2,Image. Then, for arbitrary λCImage, the matrix

[ λ I A B C D ]

Image

is of full column rank. Hence

rank ( [ λ I A B C D ] ) = n + q = n + rank ( [ B D ] )

Image

By Lemma 11.3 this system is strongly observable. Hence, if there is an input sequence d(k)|k=0Image that makes y(k,x(0),d(j)|j=0k)=0Image for every k=0,1,2,Image, then x(0)=0Image. This contradicts the assumption x(0)0Image. Therefore, the system must have some transmission zeros.

This completes the proof. □

Different from Lemma 11.2, the transmission zero in the theorem is not required to be real. On the other hand, both the initial state vector and the input vector sequence are required to be real.

11.3.2 Attack Prevention

From the results in the previous subsection it is clear that a linear time-invariant system is left invertible if and only if its transfer function matrix is of full normal column rank. This observation leads further to the following conclusions.

Corollary 11.1

If an LTI system is not left invertible, then it is certainly not attack detectable/identifiable.

Proof

Assume that a transfer function matrix G(z)Image is not of FNCR. Then there exists a nonzero vector α such that, for each λCImage, the following equality is satisfied:

G ( z ) α = 0 .

Image (11.29)

Let (A,B,C,D)Image be a real matrix quadruplex that satisfies C(zIA)1B+D=G(z)Image and associates with a state space model having a minimal realization. Take an arbitrary real scalar λ that is not an eigenvalue of the matrix A and denote the associated real and imaginary parts of the vector α by αrImage and αjImage, respectively. Then, the Eq. (11.29) means that the following two equalities are satisfied:

[ C ( λ I A ) 1 B + D ] α r = 0 , [ C ( λ I A ) 1 B + D ] α j = 0 .

Image (11.30)

Note that α0Image is equivalent to [αrαj]0Image, that is, the assumption α0Image implies that at least one of αrImage and αjImage is not equal to zero.

Assume that αr0Image. Then, the first equality of Eq. (11.30) can be equivalently rewritten as

[ λ I A B C D ] [ ( λ I A ) 1 B α r α r ] = 0 .

Image (11.31)

Note that both vectors (λIA)1BαrImage and αrImage are real. Moreover, the vector [(λIA)1Bαrαr]Image is not a zero vector. By Lemma 11.2 there always exist a nonzero x(0)RnImage and an input vector sequence d(k)RqImage, not constantly equal to zero, such that the output vector process of the LTI system, which can be expressed as y(k,x(0),d(i)|i=0k)Image, is constantly equal to zero for each k=0,1,Image. Hence the system is not attack detectable/identifiable.

When αj0Image, the same arguments show that the system is not attack detectable/identifiable.

This completes proof. □

These observations reveal that the problem of attack preventions is not trivial only when the associated system is left invertible.

With these preparations on system zero dynamics, we discuss now conditions respectively for the detectability and identifiability of attacks in a networked system.

Theorem 11.2

Concerning the system described by Eqs. (11.14) and (11.15), an attack set KImage is undetectable if and only if there exists a number λ such that the system matrix

P d ( λ ) = [ λ I A B K C D K ]

Image (11.32)

is not of full column rank.

Proof

Assume that the matrix Pd(λ)Image is not of full column rank at a particular real value of the complex variable λ, denoted λ0Image. Then, there exist vectors x(0)Image and d(0)Image such that at least one of these two vectors is not a zero vector and

( λ 0 I A ) x ( 0 ) B K d ( 0 ) = 0 ,

Image (11.33)

C x ( 0 ) + D K d ( 0 ) = 0 .

Image (11.34)

As all the involved matrices, that is, the state transition matrix A, the input matrix BKImage, the output matrix C, and the direct feedthrough matrix DKImage, and the scalar λ0Image are real valued, we can assume, without any loss of generality, that both vectors x(0)Image and d(0)Image are real valued. In fact, if they are not real valued, then denote their real and imaginary parts respectively by xr(0)Image, dr(0)Image, xj(0)Image, and dj(0)Image. Then Eqs. (11.33) and (11.34) imply that

( λ 0 I A ) x r ( 0 ) B K d r ( 0 ) = 0 , ( λ 0 I A ) x j ( 0 ) B K d j ( 0 ) = 0 ,

Image (11.35)

C x r ( 0 ) + D K d r ( 0 ) = 0 , C x j ( 0 ) + D K d j ( 0 ) = 0 .

Image (11.36)

Note that by their definitions all the vectors xr(0)Image, dr(0)Image, xj(0)Image, and dj(0)Image are real valued. Moreover, at least one of them is not equal to zero. Reasonability of the adopted assumption immediately follows.

We now can declare the undetectability of the attack set KImage through a direct application of Lemma 11.2.

Assume now that the system matrix Pd(λ)Image is not of full column rank at a particular complex value λ0Image. If the system is not left invertible, then according to the aforementioned arguments, we can declare that the system is not detectable of the attack set KImage.

Now, assume that the system is left invertible. Then from Theorem 11.1 we can declare that the attack set KImage is not detectable.

Assume that the attack set KImage is detectable. Then, by Corollary 11.1 its transfer function matrix must be of full normal column rank. From this observation and the definition of attack detectability, this system cannot have any transmission zero.

This completes proof. □

From Eqs. (11.33) and (11.34) we can also see that for the existence of an undetectable attack, it is necessary that the cardinality of the attack set is sufficiently large [7]. On the other hand, it is possible to improve attack detectability of a networked system through introducing state feedback into the system. The motivations are that a state feedback can modify transmission zeros of a system and the feedback gain can be designed with objectives that are unknown to an attacker.

Similarly, we have the following criterion for attack identifiability of the networked system described by Eqs. (11.14) and (11.15).

Theorem 11.3

Concerning the system described by Eqs. (11.14) and (11.15), an attack set KImage is unidentifiable, if and only if there exist an attack set RImage with its cardinality not greater than that of the attack set KImage and a number λ such that the matrix

P i ( λ ) = [ λ I A B K B R C D K D R ]

Image (11.37)

is not of full column rank.

Proof

Assume that at a particular real value of the complex variable λ, denoted λ0Image, the matrix-valued function Pi(λ)Image is not of full column rank. Then there exist vectors x(0)Image, dK(0)Image, and dR(0)Image such that at least one of these three vectors is not a zero vector, and

( λ 0 I A ) x ( 0 ) B K d K ( 0 ) B R d R ( 0 ) = 0 ,

Image (11.38)

C x ( 0 ) + D K d K ( 0 ) + D R d R ( 0 ) = 0 .

Image (11.39)

Using the same arguments as in the proof of Theorem 11.2, we can assume, without loss of any generality, that the vectors x(0)Image, dK(0)Image, and dR(0)Image are real valued. Then by Lemma 11.2 there exist an initial state vector x(0)RnImage and an input vector sequence u(k)RK+RImage, k=0,1,Image, such that the output vector sequence y(k)Image of the following system is constantly equal to zero.

x ( k + 1 ) = A x ( k ) + [ B K B R ] u ( k ) ,

Image (11.40)

y ( k ) = C x ( k ) + [ D K D R ] u ( k ) .

Image (11.41)

Moreover, at least either the initial state vector x(0)Image is not a zero vector, or the input vector sequence u(k)|k=0Image is not constantly equal to zero. Note that both matrices [BKDK]Image and [BRDR]Image are of full column rank. Direct algebraic manipulations show that, for this plant initial state vector x˜(0)RnImage, there exist an attack d(k)|k=0Image with K elements in d(k)|k=0Image not constantly zero, an attack d¯(k)|k=0Image with K¯Image elements in d(k)|k=0Image not constantly zero, and the integer K¯Image satisfying 0K¯KImage such that

y ( k , x ˜ ( 0 ) , d ( j ) d ¯ ( j ) | j = 0 k ) 0 , | | d ( k ) | k = 0 d ¯ ( k ) | k = 0 | | 0 ,

Image

that is, the attack set KImage is not identifiable.

By the same token as that adopted in the proof of Theorem 11.2, we can prove that even if the aforementioned λ0Image takes a particular complex value, the system is also unidentifiable with respect to the attack set KImage.

This completes the proof. □

On the basis of geometric linear system theories, some graph theoretic conditions have also been established for the existence of an undetectable attack set in a networked system and for the existence of an unidentifiable attack set in a networked system. These conditions depend only on system structures and are valid for almost all system parameters when the networked system has a compatible structure. A system property with these characteristics is extensively called generic and has the advantage of strong robustness against parametric modeling errors [14]. When the initial state vector of a networked system is known, it is proven that if the attack-state-output graph is sufficiently connected, then undetectable attacks do not exist in almost all structurally compatible networked systems. When the initial state vector is not known for a networked system, the criterion becomes more complicated, and left invertibility of the system may be required. Detailed discussions can be found, for example, in [7,8].

11.4 Detection of Attacks

In the above section, we developed some criteria to verify whether or not an attack can be detected or identified. These criteria are closely related to the transmission zeros of a networked system and can be verified in principle. When the scale of the system is large, however, computations of its transmission zeros may be computationally prohibitive and/or numerically unstable. This means that further efforts are required to develop a computationally attractive method for verifying attack detectability and identifiability. To achieve this objective, the methods given in Chapter 3 may be helpful, in which structure information is explicitly and efficiently utilized in the verification of controllability and observability of a large-scale networked system, noting that both the system matrix Pd(λ)Image of Eq. (11.32) and the system matrix Pi(λ)Image of Eq. (11.37) have a similar form of the matrix-valued polynomial M(λ)Image of Eq. (3.29). In this section, we investigate how to detect an attack using a centralized observer.

The design of an attack detector consists of a discrete-time modified Luenberger observer, which is sometimes also called a residual filter, with its input being the plant output measurements y(k)|k=0Image and its output being a residual signal r(k)|k=0Image:

z ( k + 1 ) = A z ( k ) L [ y ( k ) C z ( k ) ] ,

Image (11.42)

r ( k ) = C z ( k ) y ( k ) ,

Image (11.43)

where the matrix L is usually called the gain of the observer, which is selected to make the matrix A+LCImage stable, that is, all its eigenvalues have a magnitude smaller than 1. When the matrix pair (A,C)Image is observable, the existence of a desirable gain matrix L is always guaranteed. The residue signal r(k)Image is used to detect the existence of an attack. Ideally, this signal is constantly equal to zero if and only if there do not exist any attacks in the networked system. As the initial state vector of the observer also affects the residual signal, these expectations cannot be satisfied in general, and some modifications are required to develop a practically meaningful criterion to check the existence of an attack. However, we have the following theoretical results on the aforementioned residual filter.

Theorem 11.4

Concerning the system described by Eqs. (11.14) and (11.15), assume that an attack set KImage is detectable and the initial state vector of the system, that is, x(0)Image, is known. Assume also that the initial state vector of the residual filter described by Eqs. (11.42) and (11.43), that is, z(0)Image, is set to z(0)=x(0)Image, and the gain matrix L is selected such that the matrix A+LCImage is stable. Then, the residual signal of the filter is constantly equal to zero if and only if the attack disturbance is constant equal to zero.

Proof

Define the vector-valued function e(k)=z(k)x(k)Image. Then from Eqs. (11.14), (11.15), (11.42), and (11.43) we can straightforwardly prove that

e ( k + 1 ) = z ( k + 1 ) x ( k + 1 ) = { A z ( k ) L [ C x ( k ) + D K d K ( k ) C z ( k ) ] } { A x ( k ) + B K d K ( k ) } = ( A + L C ) e ( k ) ( L D K + B K ) d K ( k ) ,

Image (11.44)

r ( k ) = C z ( k ) [ C x ( k ) + D K d K ( k ) ] = C e ( k ) D K d K ( k ) .

Image (11.45)

When the initial state vector of the detector is set to be the same value of the initial state vector of the system, that is, z(0)=x(0)Image, we have that e(0)=0Image. Hence, when the matrix A+LCImage has all its eigenvalues being smaller than 1 in magnitude and the attack disturbances do not exist, that is, dK(k)0Image, it is obvious from Eq. (11.44) that the error vector sequence e(k)Image with k=0,1,2,Image is also constantly equal to zero. This further implies that the output vector of the detector, that is, the residual vector r(k)Image is also constantly equal to zero when the temporal variable k takes any value from the set {0,1,2,}Image.

Now assume that there exists a nonzero attack vector sequence dK(k)|k=0Image such that the output of the attack detector described by Eqs. (11.42) and (11.43) is constantly equal to zero. Then by Theorem 11.4 there must exist at least one λCImage, one vector e¯(0)Image, and one vector d¯K(0)Image such that at least one of these two vectors is not equal to zero and

[ λ I ( A + L C ) L D K + B K C D K ] [ e ¯ ( 0 ) d ¯ K ( 0 ) ] = 0 .

Image (11.46)

Note that

[ λ I ( A + L C ) L D K + B K C D K ] = [ I L 0 I ] [ λ I A B K C D K ] .

Image

Note also that the matrix [IL0I]Image is invertible for every real-valued matrix L with compatible dimension. It is obvious that the satisfaction of Eq. (11.46) is equivalent to the satisfaction of the equation

[ λ I A B K C D K ] [ e ¯ ( 0 ) d ¯ K ( 0 ) ] = 0 ,

Image (11.47)

which contradicts the assumption that the attack set KImage is detectable. This completes the proof. □

Although the results of Theorem 11.4 are promising, the system initial state vector x(0)Image is usually not known exactly. Under such a situation, the condition z(0)=x(0)Image can generally not be satisfied properly for the initial state vector of the residual filter, and z(0)Image may be selected with some arbitrariness. This means that even if there do not exist attacks in the networked system, the residual signal r(k)Image may not be constantly equal to zero. However, it can be proven that when the networked system has not been attacked, the residual signal r(t)Image asymptotically converges to zero with the increment of the temporal variable k. On the other hand, measurement errors and external random disturbances are usually unavoidable in actual networked systems, which means that even if the conditions of Theorem 11.4 are perfectly satisfied, the residual signal r(k)Image may not be constantly equal to zero and some statistical hypothesis verification methodologies appear necessary for the determination of whether or not there are attacks in the networked system. In addition to these, parametric uncertainties and unmodeled dynamics usually exist in the model of Eqs. (11.14) and (11.15). Hence, in actual implementations of attack detections, in addition to construct a stable residual filter, the gain matrix L should be selected to reduce sensitivities of the residual signal r(k)Image to modeling errors while keeping it satisfactorily sensitive to attacks. Furthermore, for a large-scale networked system, it is usually more attractive to implement an attack detector in a distributed way using only local subsystem output measurements. Detailed discussions on these issues are given in various literature, for example, [9], [8], and [7].

11.5 Identification of Attacks

In actual applications, it is usually not only necessary to know whether or not there exists an attack in a networked system, but also essential to know where the attack is from when it exists. Differently from attack detection, attack identification is computationally much more difficult due to its combinatorial characteristics, although their essential treatments are the same. In fact, it has been proven in [8] that attack identifications are generally NP hard with respect to the number of colluding attackers. More precisely, the identification of an attack from the attack set KImage usually requires a combinatorial procedure, noting that even when the number q of attackers is known, the actual attack is only one of the (qK)Image possible attacks. To identify this attack, each possibility must be considered, and for each possible attack, a residual filter needs be designed. When the attack set is identifiable, there exists one and only one residual filter whose outputs are constantly equal to zero. It is this residual filter that indicates the actual attack.

Unlike the residual filter in attack detection, the model of Eqs. (11.14) and (11.15) is not straightforwardly utilized in constructing the residual filters in attack identification. Usually, the design of these residual filters consists of three steps. First, a transformation is applied to the plant output vector to construct a vector without any attacks. Second, a state transformation is performed to divide the plant state vector into two parts, one affected by attack disturbances and the other is free from attack disturbances. The third step is to construct a residual filter with the transformed plant output vector and the transformed plant state vector that are not affected by attack disturbances, which is completely the same as that in the construction procedure for attack detections.

Now, we discuss the first step, in which the attack identification problem for the networked system described by Eqs. (11.14) and (11.15) is converted to that of a modified system in which the associated system output vector is not attacked.

For this purpose, let DKImage denote the pseudo-inverse of the matrix DKImage. Define an auxiliary networked system as

x ( k + 1 ) = [ A B K D K C ] x ( k ) + B K [ I D K D K ] d ( k ) ,

Image (11.48)

y ( k ) = [ I D K D K ] C x ( k ) .

Image (11.49)

Then, we have the following conclusions.

Lemma 11.4

Let the networked system be described by Eqs. (11.14) and (11.15). The attack set KImage is identifiable if and only if this attack set is identifiable for the networked system described by Eqs. (11.48) and (11.49).

Proof

According to Theorem 11.3, if the attack set KImage is identifiable for the networked system described by Eqs. (11.14) and (11.15), then for each attack set RImage with its cardinality not greater than that of the attack set KImage, the following system matrix Pi(λ)Image is of full column rank at each complex number λ:

P i ( λ ) = [ λ I A B K B R C D K D R ] .

Image (11.50)

It is equivalent to that the matrix P¯i(λ)Image defined as

P ¯ i ( λ ) = [ λ I A B K B R C D K D R C D K D R ]

Image (11.51)

is of full column rank at each complex number λ.

Multiplying both sides of the above equation from the left by diag{I,DKDK,IDKDK}Image, we have that

[ I D K D K I D K D K ] P ¯ i ( λ ) = [ λ I A B K B R D K D K C D K D K D K D R [ I D K D K ] C 0 [ I D K D K ] D R ] .

Image (11.52)

Note that

[ I B K D K 0 0 0 I 0 I 0 ] [ λ I A B K B R D K D K C D K D K D K D R [ I D K D K ] C 0 [ I D K D K ] D R ] = [ λ I A + B K D K C B K ( I D K D K ) B R + B K D K D R [ I D K D K ] C 0 [ I D K D K ] D R D K D K C D K D K D K D R ] .

Image (11.53)

Moreover, the matrix

[ I B K D K 0 0 0 I 0 I 0 ]

Image

is always invertible, and it is obvious that the above equations mean that the matrix-valued function

[ λ I A + B K D K C B K ( I D K D K ) B R + B K D K D R [ I D K D K ] C 0 [ I D K D K ] D R ]

Image

is also of full column rank at each complex λ. It can therefore be concluded from Theorem 11.3 that the attack set KImage is also identifiable for the networked system described by Eqs. (11.48) and (11.49).

On the contrary, if the attack set KImage is identifiable for the networked system described by Eqs. (11.48) and (11.49), then similar arguments show that this attack set is also identifiable for the networked system described by Eqs. (11.14) and (11.15).

This completes the proof. □

To construct a residual signal that is able to identify an attack, the state space model of Eqs. (11.48) and (11.49) is further transformed. To develop this transformation, however, some concepts and results are needed from geometric control theory, which can be found, for example, in [11,12].

Definition 11.7

Assume that the dynamics of a discrete-time and linear time-invariant system can be described by the following state space model:

x ( k + 1 ) = A x ( k ) + B u ( k ) , y ( k ) = C x ( k ) ,

Image

where x(k)RnImage, y(k)RpImage, and u(k)RqImage denote respectively the state vector, input vector, and output vector of the system. Moreover, assume that SImage is a subspace of RnImage. If

A ( S N u l l ( C ) ) S ,

Image

then this subspace is said to be (A,Null(C))Image-conditioned invariant.

A conditioned invariant subspace is closely related to estimator designs. A well-known result is that the smallest (A,Null(C))Image-conditioned invariant subspace that contains the subspace Span(B)Image as its subset is the largest subspace in the state space RnImage of the system, which can be estimated in the presence of an unknown input sequence u(k)|k=0Image. This conclusion is obviously of great significance in engineering, as it clarifies situations for the existence of a state estimator with unknown inputs. Another well-known result is that the null space of the observability matrix, that is,

N u l l ( [ C T A T C T ( A T ) 2 C T ( A T ) n 1 C T ] T ) ,

Image

which gives all the initial state vectors that result in a zero system output when there do not exist any external inputs are an (A,Null(C))Image-conditioned invariant subspace.

It has been proven that for each (A,Null(C))Image-conditioned invariant subspace SImage, there exists at least one real matrix L such that

( A + L C ) S S ,

Image

that is, an output injection matrix that renders this subspace invariant.

To apply these concepts and results to the system described by Eqs. (11.48) and (11.49), let S¯KImage denote the smallest (ABKDKC,Null([IDKDK]C))Image-conditioned invariant subspace that contains Span(BK[IDKDK])Image as its subspace. Moreover, let L be a matrix that satisfies

S ¯ K ( A B K D K C + L [ I D K D K ] C ) S ¯ K

Image

Existence of a desirable matrix L is guaranteed by the properties of the subspace S¯KImage. Furthermore, let T be a nonsingular square matrix satisfying

T [ A B K D K C + L C ] T 1 = [ A ¯ 11 A ¯ 12 0 A ¯ 22 ] , T B K [ I D K D K ] = [ B ¯ K 0 ] ,

Image (11.54)

[ I D K D K ] C T 1 = [ C ¯ 1 C ¯ 2 ]

Image (11.55)

where B¯KImage is a matrix of full row rank. We can prove using geometric control theories that a desirable matrix T always exists [8,11,12]. As a matter of fact, the matrix T can be constituted from a basis of the subspace S¯KImage and a basis of the quotient subspace RnS¯KImage [11,12].

With the aforementioned matrix transformation, we can establish the following conclusions.

Lemma 11.5

Let the networked system be described by Eqs. (11.14) and (11.15). The attack set KImage is identifiable if and only if this attack set is identifiable for the following networked system:

[ x 1 ( k + 1 ) x 2 ( k + 1 ) ] = [ A ¯ 11 A ¯ 12 0 A ¯ 22 ] [ x 1 ( k ) x 2 ( k ) ] + [ B ¯ K 0 ] d ( k ) ,

Image (11.56)

y ( k ) = [ C ¯ 1 C ¯ 2 ] [ x 1 ( k ) x 2 ( k ) ] .

Image (11.57)

Proof

Assume that the networked system described by Eqs. (11.56) and (11.57) is attack identifiable for the attack set KImage. Then, for each complex number λ and each attack set RImage with its attacker number not greater than k, if

[ λ I A ¯ 11 A ¯ 12 B ¯ K B R 1 0 λ I A ¯ 22 0 B R 2 C ¯ 1 C ¯ 2 0 D R ] [ x 1 ( 0 ) x 2 ( 0 ) g K g R ] = 0 ,

Image (11.58)

then it is necessary that all the vectors x1(0)Image, x2(0)Image, gKImage, and gRImage are zero vectors, that is,

[ x 1 T ( 0 ) x 2 T ( 0 ) g K T g R T ] T = 0 .

Image (11.59)

Note that

[ A ¯ 11 A ¯ 12 B ¯ K 0 A ¯ 22 0 C ¯ 1 C ¯ 2 0 ] = [ T [ A B K D K C + L C ] T 1 T B K [ I D K D K ] [ I D K D K ] C T 1 0 ] = [ T 0 0 I ] [ A B K D K C + L C B K [ I D K D K ] [ I D K D K ] C 0 ] [ T 1 0 0 I ] .

Image (11.60)

We therefore have that

[ λ I A ¯ 11 A ¯ 12 B ¯ K B R 1 0 λ I A ¯ 22 0 B R 2 C ¯ 1 C ¯ 2 0 D R ] = [ T 0 0 I ] × [ λ I [ A B K D K C + L C ] B K [ I D K D K ] T 1 [ B R 1 B R 2 ] [ I D K D K ] C 0 D R ] [ T 1 0 0 0 I 0 0 0 I ] .

Image (11.61)

As the matrix T is regular by its definition, it is obvious that both two matrices

[ T 0 0 I ] , [ T 1 0 0 0 I 0 0 0 I ]

Image

are well defined and are of full rank. We can therefore claim from Eqs. (11.58) and (11.61) that the matrix

[ λ I [ A B K D K C + L C ] B K [ I D K D K ] B R [ I D K D K ] C 0 D R ]

Image

is also of full column rank for each complex number λ and each attack set RImage with its attacker number not greater than that of the attack set KImage. Hence, from Lemma 11.4 and Theorem 11.3 we can declare attack identifiability for the attack set KImage in the networked system described by Eqs. (11.14) and (11.15).

Similar arguments show that if the attack set KImage is identifiable for the networked system described by Eqs. (11.14) and (11.15), then it is also identifiable for the networked system described by Eqs. (11.56) and (11.57).

This completes the proof. □

Note that in the state space model of Eqs. (11.56) and (11.57), the state vector x2(k)Image is completely isolated from any attack disturbances. Using this property, a residual filter can be constructed as

w ( k + 1 ) = [ A ¯ 22 L ¯ ( I C ¯ 1 C ¯ 1 ) ] w ( k ) L ¯ y ¯ ( k ) ,

Image (11.62)

y ¯ ( k ) = [ I C ¯ 1 C ¯ 1 ] y ( k ) ,

Image (11.63)

r K ( k ) = [ I C ¯ 1 C ¯ 1 ] C ¯ 2 w ( k ) y ¯ ( k ) .

Image (11.64)

This residual filter is very similar to that for attack detections given in the previous section, except that the output vector y(k)Image is replaced by a modified output vector y¯(k)Image. Obviously, this modification is removing C¯1x1(k)Image from the output vector y(k)Image with the purpose of constructing an output vector that is not affected by any attack disturbances, noting that the substate vector x2(k)Image is independent of the attack disturbance vector d(k)Image and

y ¯ ( k ) = [ I C ¯ 1 C ¯ 1 ] y ( k ) = [ I C ¯ 1 C ¯ 1 ] [ C ¯ 1 C ¯ 2 ] [ x 1 ( k ) x 2 ( k ) ] = [ I C ¯ 1 C ¯ 1 ] C ¯ 2 x 2 ( k ) .

Image

With this residual filter, we have the following results.

Theorem 11.5

Let the networked system be described by Eqs. (11.14) and (11.15). Assume that the attack set KImage is identifiable. Moreover, assume that the initial state vector x(0)Image of the networked system is known. Furthermore, assume that the initial state vector of the residual filter described by Eqs. (11.62)(11.64) is set as w(0)=x2(0)Image and that the gain matrix L¯Image is selected such that the matrix A¯22L¯(IC¯1C¯1)C¯2Image is stable. Then the residual signal r(k)Image of the above filter is constantly equal to zero if and only if both matrices BKImage and DKImage coincide with those of the actual colluding attackers.

Proof

Using the state vector w(k)Image of the attack identifier described by Eqs. (11.62)(11.64), define the new state vector w¯(k)Image evolving as

w ¯ ( k + 1 ) = A ¯ 11 w ( k ) + A ¯ 12 w ¯ ( k ) .

Image (11.65)

Moreover, define the error vector e(k)=col{e1(k),e2(k)}Image, in which two suberror vectors e1(k)Image and e2(k)Image are respectively defined as e1(k)=w¯(k)x1(k)Image and e2(k)=w(k)x2(k)Image. Then on the basis of Eqs. (11.14), (11.15), (11.62)(11.64), and (11.65), straightforward algebraic manipulations show that

e ( k + 1 ) = [ A ¯ 11 A ¯ 12 0 A ¯ 22 L ¯ ( I C ¯ 1 C ¯ 1 ) C ¯ 2 ] e ( k ) [ B ¯ K 0 ] d ( k ) ,

Image (11.66)

r K ( k ) = [ 0 ( I C ¯ 1 C ¯ 1 ) C ¯ 2 ] e ( k ) .

Image (11.67)

When w(0)Image is set to be equal to x2(0)Image, the definition of the suberror vector e2(k)Image implies that e(0)=0Image. As the residual signal rK(k)Image is not directly affected by the attack disturbances d(k)Image, and the matrix L¯Image is selected to assure that the magnitude of each eigenvalue of the matrix A¯22L¯(IC¯1C¯1)C¯2Image is smaller than 1, this residual signal rK(k)Image is constantly equal to zero, provided that the attack set is KImage. This completes the proof. □

In case that there exist modeling errors in the adopted state space model, and/or there are random disturbances in the networked system, and/or the initial state vector x(0)Image is not exactly known, observations in attack detections are also valid for the above attack identification methods. On the other hand, from the above results it is clear that to identify an attack, it is necessary to construct at least one residual filter for each possible attack. This might be not feasible when the number K is large.

Similarly to the attack detector discussed in the previous section, it is also possible to realize the above attack identifier in a distributed way. The details are not included in this book. Some preliminary results can be found in [8,9].

11.6 System Security and Sensor/Actuator Placement

The previous sections make it clear that to detect and/or identify an attack in networked systems, system observability is an essential property. In Chapter 3, we have discussed how to verify observability of a large-scale networked system and how many sensors are required to construct an observable networked system. However, it is usually not sufficient in actual engineering to only detect and/or identify an attack. A general requirement is to achieve this objective in an allowed time period after the occurrence of an attack, if not in the shortest time period. In fact, the longer the time is needed to detect/identify an attack, the larger the damages the attack may bring.

Note that both attack detection and attack identification depend on a residual generator. As system initial states cannot be known exactly in general and both modeling errors and external disturbances are usually unavoidable in actual applications, the time period and/or data length needed to detect/identify an attack depend heavily on the performances of that residual generator in its robustness against modeling errors and its capabilities of reducing influences from external disturbances. From the results of Section 11.5 it is clear that this residual generator can be easily rewritten into a form of the Luenberger observer discussed in Chapter 4, which includes the Kalman filter as its particular form. On the other hand, recall that the Kalman filter is optimal when the plant is linear and the external disturbances are normally distributed and when the sensitivity penalization-based robust state estimator given in Chapter 4 has completely the same form as that of the Kalman filter. It is interesting from both theoretical aspects and application aspects to investigate relations among estimation performances of the Kalman filter, the number of sensors, and the length of system output measurements [15]. A dual problem is about relations about control efforts, the number of actuators and optimal control performances, which is also an interesting topic in the analysis and synthesis of networked systems [16,17]. In fact, an attacker often intends to destroy a system with minimal efforts in a stealthy way, whereas a controller usually wants to improve system performances with the smallest energy consumptions.

In this section, we investigate a sensor placement problem and an actuator placement problem, both of them related closely to system security.

From a mathematical viewpoint, both sensor placement problems and actuator placement problems can be regarded as set function optimizations. When a sensor placement problem is under investigation, a set VImage is usually adopted to denote the set consisting of all potential positions of a networked system in which a sensor can be placed, whereas the function f()Image represents metric measuring factors like the best performances that a state estimator can achieve with the measurements from several sensors belonging to a set SImage, the costs of these sensors, and so on. Then, when there are restrictions on the number of sensors in a networked system, say, it does not exceed k, a sensor placement problem may be mathematically described as

max S V , | S | k f ( S ) .

Image (11.68)

Here |S|Image stands for the number of elements in the set SImage. Basically, this is a finite combinatorial optimization problem, which can be solved in principle through exhausting all possible sensor sets that have its element number not greater than k and comparing their associated cost function values. When the number of subsystems in a networked system is large, as possibilities of sensor locations in general increase exponentially with the subsystem number, such a brute-force-based method is usually computationally prohibitive.

Similar statements are valid for actuator placements.

To deal with these sensor/actuator placement problems, assume that the dynamics of a discrete time and linear time-variant system can be described by the following state space model:

x ( k + 1 ) = A ( k ) x ( k ) + B ( k ) w ( k ) ,

Image (11.69)

y ( k ) = C ( k ) x ( k ) + v ( k ) .

Image (11.70)

Once again, here, x(k)RnImage, y(k)RpImage, and w(k)RqImage denote respectively the state vector, input vector, and output vector of the system. Moreover, assume that SImage is a subspace of RnImage.

11.6.1 Some Properties of the Kalman Filter

It is now widely known that when a plant is linear and its external disturbances are normally distributed, the Kalman filter is the optimal state estimator under the criterion of mean squares errors [18,19]. Recent studies, however, show that when mean squares errors are adopted in the determination of optimal sensor locations, the associated optimization problem generally does not have the submodularity property and is in general NP-hard. Moreover, the widely adopted greedy heuristic algorithm may perform arbitrarily poorly, and there does not exist a constant-factor (polynomial-time) approximation algorithm [20,21].

To deal with the problem of appropriately locating sensors for a networked system, in this section, we investigate relations among estimation error, data length, and plant output number in the Kalman filtering. As only stochastic properties of B(k)w(k)Image affect estimation accuracies of the Kalman filter [18,19,22], we can assume without any loss of generality that B(k)InImage. To simplify mathematical expressions, this assumption is adopted throughout this and next subsections. Moreover, we assume that the system initial conditions, process disturbances, and measurement errors are white and uncorrelated. More precisely:

  • •  x(0)Image, w(k)Image, and v(k)Image are uncorrelated with each other for k0Image;
  • •  w(k)Image and v(k)Image are uncorrelated with w(k)Image and v(k)Image for every nonnegative integer kkImage;
  • •  E(x(0))=0Image and E(w(k))=E(v(k))=0Image for each k=0,1,Image;
  • •  Var(x(0))>0Image, Var(w(k))>0Image, and Var(v(k))=σ2IImage with σ>0Image for every nonnegative integer k.

Using the same arguments as in the derivation of Eq. (3.6), we can show straightforwardly from Eqs. (11.69) and (11.70) that, for arbitrary nonnegative integers k and kImage with 0kkImage, the following equality is valid:

[ y ( 0 ) y ( 1 ) y ( 2 ) y ( k ) x ( k ) ] = [ Γ ( k ) L ( k ) ] [ x ( 0 ) w ( 0 ) w ( 1 ) w ( 2 ) w ( k 1 ) ] + [ v ( 0 ) v ( 1 ) v ( 2 ) v ( k ) 0 ] ,

Image (11.71)

where

Γ ( k ) = [ C 0 0 0 0 0 C ( 1 ) A ( 0 ) C ( 1 ) 0 0 0 C ( 2 ) A ( 1 ) A ( 0 ) C ( 2 ) A ( 1 ) C ( 2 ) 0 0 C ( k ) j = k 1 0 A ( j ) C ( k ) j = k 2 0 A ( j ) C ( k ) j = k 3 0 A ( j ) C ( k ) j = k 4 0 A ( j ) C ( k ) ] , L ( k ) = [ j = k 1 0 A ( j ) j = k 2 0 A ( j ) I 0 0 ] .

Image

If x(0)Image, w(j)|j=0k1Image, and v(k)|j=0kImage are normally distributed, then we can declare from Eq. (11.71) that

[ y T ( 0 ) y T ( 1 ) y T ( 2 ) y T ( k ) x T ( k ) ] T

Image

is also normally distributed, which is equivalent to that

x ( k ) and [ y T ( 0 ) y T ( 1 ) y T ( 2 ) y T ( k ) ] T

Image

are jointly normally distributed.

For simplicity, denote the vectors

[ y T ( 0 ) y T ( 1 ) y T ( k ) ] T , [ x T ( 0 ) w T ( 0 ) w T ( 1 ) w T ( k 1 ) ] T , and [ v T ( 0 ) v T ( 1 ) v T ( k ) ] T

Image

respectively by Y(k)Image, Z(k1)Image, and V(k)Image. Then Eq. (11.71) can be rewritten as follows:

[ Y ( k ) x ( k ) ] = [ Γ ( k ) I L ( k ) 0 ] [ Z ( k 1 ) V ( k ) ] .

Image (11.72)

In addition, from the system output vector measurements y(j)Image, j=0,1,,kImage, the best estimate of the system state vector x(k)Image under the criterion of mean squares errors, denoted xˆ(k)Image, has been proven to have the following closed-form expression [18,19,22]:

x ˆ ( k ) = E ( x ( k ) | Y ( k ) ) ,

Image (11.73)

that is,

x ˆ ( k ) = arg min α E ( [ α x ( k ) ] T [ α x ( k ) ] )

Image (11.74)

under the constraints of Eq. (11.71).

Based on this relation and the adopted assumptions on the external disturbance process w(k)|k=0Image and the measurement error process v(k)|k=0Image, direct algebraic manipulations show that

[ Y ( k ) x ( k ) ] N ( [ 0 0 ] , [ Γ ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) + σ 2 I Γ ( k ) Var ( Z ( k 1 ) ) L T ( k ) L ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) L ( k ) Var ( Z ( k 1 ) ) L T ( k ) ] ) .

Image (11.75)

Concerning joint normal distributions, the following result is well known [23,24].

Lemma 11.6

Assume that random vectors x and y are jointly normally distributed. Moreover, assume that

E ( [ x y ] ) = [ μ x μ y ] , Var ( [ x y ] ) = [ V x V y x T V y x V y ] .

Image

Then the conditional random vector y|xImage is also normally distributed. In addition,

E ( y | x ) = μ y + V y x V x 1 [ x μ x ] , Var ( y | x ) = V y V y x V x 1 V y x T .

Image

On the basis of Eqs. (11.73) and (11.72) and Lemma 11.6, the following equality can be straightforwardly established:

x ˆ ( k ) = E ( x ( k ) | Y ( k ) ) = L ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) [ Γ ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) + σ 2 I ] 1 Y ( k ) .

Image (11.76)

Recall that when a linear space is constituted from n-dimensional random vectors with zero mean and finite covariance matrix, an inner product can be defined as

< x , y > = E ( x T y ) .

Image

Obviously, the norm induced from this inner product is consistent with the cost function adopted in the optimization problem of Eq. (11.74), that is, the mean squares errors. Hence, from the projection-based optimization condition [25,26] we have that

< x ( k ) x ˆ ( k ) , x ˆ ( k ) > = E ( [ x ( k ) x ˆ ( k ) ] T x ˆ ( k ) ) = 0 .

Image (11.77)

Denote the estimation error vector x(k)xˆ(k)Image by x˜(k)Image. From this equality and Eq. (11.75) direct algebraic operations show that

Var ( x ˜ ( k ) ) = E ( [ x ( k ) x ˆ ( k ) ] [ x ( k ) x ˆ ( k ) ] T ) = L ( k ) Var ( Z ( k 1 ) ) L T ( k ) L ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) [ Γ ( k ) Var ( Z ( k 1 ) ) Γ T ( k ) + σ 2 I ] 1 × Γ ( k ) Var ( Z ( k 1 ) ) L T ( k ) = L ( k ) [ Var 1 ( Z ( k 1 ) ) + 1 σ 2 Γ T ( k ) Γ ( k ) ] 1 L T ( k ) .

Image (11.78)

A recursive realization of Eqs. (11.76) and (11.78) is the well-known Kalman filter [18,19]. Rather than the recursive formula of the Kalman filter, which is given by Eqs. (4.8) and (4.9), expression (11.78) for the covariance matrix of estimation errors appears to be much more convenient in the analysis of the relations among sensor positions, sensor number, measurement data length, and estimation accuracy.

Define the random variable

δ ( k ) = x ˜ T ( k ) Var 1 ( x ˜ ( k ) ) x ˜ ( k )

Image (11.79)

Recall that the random vector x(k)Image is normally distributed. On the other hand, it is clear from Eqs. (11.72) and (11.76) that the random vector xˆ(k)Image is also normally distributed. We can therefore declare that the estimation error vector x˜(k)Image is also normally distributed. In addition, note that E[E(y|x)]=E(y)Image for arbitrary random vectors x and y [23,24]. We therefore have

E ( x ˜ ( k ) ) = E ( x ( k ) x ˆ ( k ) ) = E ( x ( k ) ) E ( E ( x ( k ) | Y ( k ) ) ) = 0 .

Image (11.80)

Hence the random variable δ(k)Image has the χn2Image distribution, that is, the χ2Image distribution with n degrees of freedom [23,24]. This property is quite helpful in developing algorithms for attack detections and attack identifications [6,7].

It is also worth mentioning that

E { [ x ( k ) x ˆ ( k ) ] T [ x ( k ) x ˆ ( k ) ] } = E { tr ( [ x ( k ) x ˆ ( k ) ] [ x ( k ) x ˆ ( k ) ] T ) } = tr { E ( [ x ( k ) x ˆ ( k ) ] [ x ( k ) x ˆ ( k ) ] T ) } = tr { Var ( x ˜ ( k ) ) } .

Image (11.81)

These observations mean that the covariance matrix Var(x˜(k))Image of the estimation error is closely related to both some statistics adopted in attack detections/identifications and the estimation accuracy measured by the mathematical expectation of squared estimation errors.

11.6.2 Sensor Placements

In both attack detections and attack identifications, it is essential to distinguish an abnormal situation of a networked system from its normal situation as fast as possible. This means that it is preferable to have sensors in positions that can make a detector achieve the minimal value of a metric on its residual signal when there do not exist any attacks in the associated networked system and reach a value of that metric as large as possible if the associated networked system has been attacked. Keeping in mind relations among state observers and detectors, in this subsection, we investigate sensor placements with the objective of minimizing state estimation errors.

For this purpose, we adopt the following assumption for the system described by Eqs. (11.69) and (11.70) to reflect characteristics in optimizations of sensor locations.

Assumption 11.1

The matrix C(k)Image is time invariant. Moreover, each of its rows has a nonzero element equal to 1. Furthermore, each of its columns has at most one nonzero element.

This assumption reflects that all sensor positions have been fixed, each sensor only directly measures one of the states of the networked system, and if a state in the networked system is directly measured, then it can only be measured by one sensor. Under such an assumption, we can directly prove that the number of the sensors in the networked system is equal to the number of nonzero elements in the matrix C, which is an abbreviation of the matrix C(k)Image when Assumption 11.1 is satisfied.

Define the set

S = { i | C = [ c j l | j = 1 , l = 1 j = p , l = n ]   satisfies Assumption 11.1 , and   there exists   j { 1 , 2 , , p }   such that   c j i = 1 } .

Image

It is clear that this set indicates all the states in the networked system described by Eqs. (11.69) and (11.70) for which there is a sensor measuring its value. With a little abuse of terminology, in this subsection, we call this set the sensor set for that networked system.

For an arbitrary t×sImage-dimensional matrix Ξ, let (Ξ)ijImage represent its ith row jth column element, i=1,2,,tImage and j=1,2,,sImage. Define scalars μ(k)Image, vw(k)Image, v0Image, and v0inImage respectively as

μ ( k ) = max 0 i k σ max ( A ( i ) ) , v w ( k ) = max 0 i k max 1 j n ( Var ( w ( i ) ) ) j j , v 0 = max 1 j n ( Var ( x ( 0 ) ) ) j j , v 0 i n = max 1 j n ( Var 1 ( x ( 0 ) ) ) j j .

Image

On the basis of the results given in the previous subsection on the Kalman filtering, the following results have been obtained in [15], which establish some relations among the number of sensors, data length, and state estimation accuracy in a networked system.

Theorem 11.6

Let the networked system be described by Eqs. (11.69) and (11.70), and let xˆ(k)Image stand for the estimate of its state vector x(k)Image using the Kalman filter. Assume that σ>0Image and μ(k)1Image. Then for every k=1,2,Image,

n σ 2 λ min ( L T ( k ) L ( k ) ) | S | 1 μ 2 ( k + 1 ) 1 μ 2 + σ 2 v 0 i n E ( | | x ( k ) x ˆ ( k ) | | 2 2 ) n ( k + 1 ) λ max ( L T ( k ) L ( k ) ) max { v 0 , v w ( k ) } ,

Image (11.82)

where |S|Image is the number of elements in the sensor set SImage.

This theorem reveals that the lower bound of state estimation accuracy decreases only inversely proportionally to the number of the sensors in the system and increases linearly with the increment of the number of the states in the system. This means that estimation errors cannot be significantly reduced through only adding sensors to the system, and for a large-scale system, measurements with a short data length cannot lead to an acceptable estimation accuracy in general.

From this theorem conditions can be derived on the number of sensors and data length required to meet some prescribed estimation accuracy. More precisely, assume that it is required that E(||x(k)xˆ(k)||22)αImage with some prescribed positive number α. Then the following conclusions can be derived from Theorem 11.6.

  • •  Assume that the date length k is fixed. Then, to satisfy E(||x(k)xˆ(k)||22)αImage, it is necessary that

| S | [ n σ 2 λ min ( L T ( k ) L ( k ) ) α σ 2 v 0 i n ] 1 μ 2 1 μ 2 ( k + 1 )

Image

  • •  Assume that the number of sensors |S|Image, is fixed. Then, to satisfy E(||x(k)xˆ(k)||22)αImage, it is necessary that

k log { 1 [ n σ 2 λ min ( L T ( k ) L ( k ) ) α σ 2 v 0 i n ] 1 μ 2 | S | } 2 log ( μ ) 1 .

Image

These relations imply that the required number of sensors increases linearly with the increment of the number of the system states, but the required data length increases only logarithmically.

Now, we investigate relations between sensor sets and the covariance matrix of estimation errors, that is, Var(x˜(k))Image. For this purpose, define L(j)Image with j=0Image as L(0)=[In00]Image. Then from the definitions of the matrices Γ(k)Image and L(k)Image it is obvious that

Γ ( k ) = col { C L ( j ) | j = 0 k } .

Image (11.83)

Let ejImage denote the jth canonical basis of the n-dimensional Euclidean space RnImage, j=1,2,,nImage, that is, ejImage is an n-dimensional column vector with its jth row element 1 and all other elements 0. Moreover, for each i=1,2,,nImage, define the matrix

I ( i ) = e i e i T .

Image

Furthermore, let s(i)Image with i=1,2,,nImage, be the indicator function defined as

s ( i ) = { 1   if the i th state of the system is directly measured by a sensor, 0   if the i th state of the system is not directly measured by any sensor.

Image

By Assumption 11.1 the output matrix C of the networked system described by Eqs. (11.69) and (11.70) can be expressed as

C = col { e j ( i ) T | i = 1 p } ,

Image (11.84)

where j(i)Image is the position of the nonzero element in the ith row of the matrix C, i=1,2,,pImage. Moreover, the following equalities are valid:

C T C = m = 1 p e j ( m ) e j ( m ) T = m S I ( m ) = m = 1 n s ( m ) I ( m ) .

Image (11.85)

To clarify dependence of the matrices Γ(k)Image and Var(x˜(k))Image on the sensor set SImage, we further reexpress them respectively as Γ(k,S)Image and Var(x˜(k),S)Image, where the sensor set SImage is explicitly included. By Eqs. (11.83), (11.84), and (11.85) we have that

Γ T ( k , S ) Γ ( k , S ) = j = 0 k L T ( j ) C T C L ( j ) = j = 0 k ( L T ( j ) m = 1 n s ( m ) I ( m ) L ( j ) ) = m = 1 n s ( m ) ( j = 0 k L T ( j ) I ( m ) L ( j ) ) = m S O ( k , m ) ,

Image (11.86)

where

O ( k , m ) = j = 0 k L T ( j ) I ( m ) L ( j ) = j = 0 k ( e m T L ( j ) ) T ( e T ( m ) L ( j ) ) , k = 0 , 1 , , m = 1 , 2 , , n .

Image

From its definition it is clear that for all k=0,1,Image and m=1,2,,nImage, the matrix O(k,m)Image is independent of either the number of sensors in the system or the positions of the sensors. In addition, this matrix is at least semipositive definite.

Now, assume that there are two sensor sets S1Image and S2Image satisfying

S 1 S 2 { 1 , 2 , , n } .

Image

Then by Eq. (11.86) the following inequality is immediate:

Γ T ( k , S 2 ) Γ ( k , S 2 ) = m S 2 O ( k , m ) = m S 1 O ( k , m ) + m S 2 S 1 O ( k , m ) m S 1 O ( k , m ) = Γ T ( k , S 1 ) Γ ( k , S 1 ) .

Image (11.87)

We can therefore declare from Lemma 2.1 that

[ 1 σ 2 Γ T ( k , S 2 ) Γ ( k , S 2 ) + Var 1 ( Z ( k 1 ) ) ] 1 [ 1 σ 2 Γ T ( k , S 1 ) Γ ( k , S 1 ) + Var 1 ( Z ( k 1 ) ) ] 1 .

Image (11.88)

By Eqs. (11.78) and (11.88) and by Lemma 2.1 we further have that

Var ( x ˜ ( k ) , S 2 ) Var ( x ˜ ( k ) , S 1 )

Image (11.89)

Eqs. (11.89) and (11.81) mean that, with the addition of some sensors, both the mean squares estimation errors of the Kalman filter and the covariance matrix of its estimation errors monotonically decrease. This is consistent with engineering intuition, as a sensor addition results in more measurement data, which provide more information about the plant state vector and therefore lead to a more accurate state estimate.

However, when the number of sensors is fixed, it is still not very easy to find the optimal sensor positions that minimize either the mean squares estimation errors or the covariance matrix of the state estimation errors [15,20,21].

On the other hand, discussions in the previous section show that observability of a networked system is necessary for the existence of a residual filter that is able to detect/identify an attack. It can be shown, however, that even the problem of finding the best sensor positions that lead to an observable networked system is NP-hard.

In particular, similarly to the proof of Theorem 3.2, we can also show that to recover the state vectors of the networked system described by Eqs. (11.69) and (11.70) from its output measurements y(j)|j=0kImage, it is necessary and sufficient that the following matrix O(k,S)Image w is of full column rank:

O ( k , S ) = [ C T A T ( 0 ) C T ( A ( 1 ) A ( 0 ) ) T C T ( j = k 0 A ( j ) ) T C T ] T .

Image

Note that the matrix O(k,S)Image is of full column rank if and only if the matrix OT(k,S)O(k,S)Image is of full rank. In fact, the ranks of these two matrices equal each other [27,28]. When sensor positions are selected to guarantee the observability of a networked system, it is reasonable from these observations to investigate the problem of maximizing the rank of the matrix OT(k,S)O(k,S)Image under the restrictions that S{1,2,,n}Image and |S|qImage.

Concerning the set function involved in this optimization problem, the following conclusions are obtained.

Theorem 11.7

Assume that the dynamics of a networked system is described by Eqs. (11.69) and (11.70). Moreover, assume that Assumption 11.1 is satisfied by its output matrix C(k)Image. Define the set function f(S):2VRImage as

f ( k , S ) = rank ( O T ( k , S ) O ( k , S ) ) ,

Image (11.90)

where V={1,2,,n}Image. Then this set function is submodular and increasing for each k=0,1,2,Image.

The proof of this theorem is deferred to the appendix of this chapter.

From this theorem and the results on optimization of the set function given in Chapter 2 we can declare that optimal sensor placements for maximizing the rank of the observability matrix O(k,S)Image are in general NP-hard. On the other hand, the greedy heuristic method of that chapter usually gives a good approximation result. But it is worth noting that when the scale of a networked system is large, some computation issues may still arise for this greedy heuristic method, noting that in this case, the matrix O¯(k,i)Image of Eq. (11.A.1) often has a high dimension. These are different significantly by the results of Chapter 3, given by Corollary 3.1 and Theorem 3.10. The output matrix there is allowed to take an arbitrary real value for a networked system, and an explicit parameterization is given for all output matrices that lead to an observable system.

When the state transition matrix of the networked system described by Eqs. (11.69) and (11.70) is also time invariant, arguments similar to those of Chapter 3 show that when the observability matrix O(k,S)Image is involved in an optimization problem, it is sufficient to only consider the case k=n1Image.

11.6.3 Actuator Placements

In the previous sections, it has been argued that an attacker usually intends to give destructive damages to a system in a stealthy way with small efforts. In this subsection, we investigate selection of input positions for the system described by Eqs. (11.69) and (11.70) such that some metrics on the input efforts can be minimized. For this purpose, we further adopt the following assumption, which is in a dual form of Assumption 11.1 adopted in the previous subsection for studying sensor placements.

Assumption 11.2

The matrix B(k)Image is time invariant. Moreover, each of its columns has a nonzero element 1. Furthermore, each of its rows has at most one nonzero element.

When this assumption is satisfied, the input matrix B(k)Image is abbreviated as B for simplicity.

Concerning the system described by Eqs. (11.69) and (11.70), its controllability Gramian at the time instant k, denoted Wc(k)Image, is defined as

W c ( k ) = j = 0 k Φ ( j , 0 ) B B T Φ T ( j , 0 ) ,

Image (11.91)

where Φ(j,0)=i=j0A(i)Image. Note that inputs to a system are usually energy restricted. Moreover, some functions of the controllability Gramian Wc(k)Image have been argued to be nice quantities for measuring energies required to maneuver the system state vector from the zero-valued initial conditions to a prescribed value in the time interval [0,k]Image. Depending on whether the average energy required to maneuver the system state vector, a worst-case energy required to maneuver the system state vector or the volume of the states that can be reached through one unit or less of input energies, tr(Wc1(k))Image, σmax(Wc1(k))Image and logdet(Wc(k))Image are utilized respectively [16,17].

For each i=1,2,,nImage, let a(i)Image with i=1,2,,nImage, be an indicator function defined as

a ( i ) = { 1   if the i th state of the system is directly maneuvered by an actuator, 0   if the i th state of the system is not directly maneuvered by any actuator.

Image

By Assumption 11.2 the input matrix B of the networked system can be expressed as

B = [ e j ( 1 ) , e j ( 2 ) , , e j ( q ) ] ,

Image (11.92)

where j(i)Image is the position of the nonzero element in the ith column of the input matrix B, i=1,2,,qImage. Moreover, similarly to Eq. (11.85), we can obtain the following relation:

B B T = m = 1 n a ( m ) I ( m ) ,

Image (11.93)

where the matrix I(m)Image is defined as that in the previous subsection.

Define the set

A = { i | B = [ b j l | j = 1 , l = 1 j = n , l = q ]   satisfies Assumption 11.2 , and   there exists   j { 1 , 2 , , q }   such that   b i j = 1 } .

Image

Then it is clear that this set indicates all the states in the networked system described by Eqs. (11.69) and (11.70) that are directly controlled by an actuator. Similarly to the sensor set SImage, in this subsection, we call this set the actuator set for that networked system.

To clarify dependence of the controllability Gramian Wc(k)Image on the actuator set AImage, we further reexpress it as Wc(k,A)Image, so that the sensor set SImage is explicitly included. Using completely the same arguments as those in the derivation of Eq. (11.86), from Eqs. (11.91) and (11.93) we obtain the following relation:

W c ( k , A ) = m A a ( m ) C ( k , m ) ,

Image (11.94)

where

C ( k , m ) = j = 0 k Φ ( j , 0 ) I ( m ) Φ T ( j , 0 ) , k = 0 , 1 , , m = 1 , 2 , , n .

Image

It is clear from its definition that, for all k=0,1,Image and m=1,2,,nImage, the matrix C(k,m)Image is independent of either the number of actuators in the system or the positions of the actuators. In addition, this matrix is at least semipositive definite.

Now, assume that there are two actuator sets A1Image and A2Image satisfying

A 1 A 2 { 1 , 2 , , n } .

Image

Then by Eq. (11.94) we can obtain the following inequality by the same arguments as those in the derivation of Eq. (11.87):

W c ( k , A 2 ) W c ( k , A 1 ) .

Image (11.95)

We can therefore declare from Lemma 2.1 that when both matrices Wc(k,A1)Image and Wc(k,A2)Image are invertible, the following inequalities are valid:

tr ( W c 1 ( k , A 2 ) ) tr ( W c 1 ( k , A 1 ) ) ,

Image (11.96)

σ max ( W c 1 ( k , A 2 ) ) σ max ( W c 1 ( k , A 1 ) ) ,

Image (11.97)

log det ( W c ( k , A 2 ) ) log det ( W c ( k , A 1 ) ) ,

Image (11.98)

that is, through adding some other actuators, both the average energy required to maneuver the system state vector and the worst-case energy required to maneuver the system state vector decrease monotonically, whereas the volume of the states that can be reached through one unit or less of input energies increases monotonically. Once again, this is in a good agreement with engineering intuition.

However, optimization of these metrics is generally mathematically difficult. In particular, we have the following results.

Theorem 11.8

Let the networked system be described by Eqs. (11.69) and (11.70). Define the set function f(A):2VRImage as

f ( A ) = log det ( W c ( k , A ) ) ,

Image (11.99)

where V={1,2,,n}Image. Then this set function is submodular and increasing.

Proof

From Eq. (11.98) we can directly declare that this function is increasing.

To prove its submodularity, assume that ξ is an arbitrary element of the set VImage. Moreover, assume that A1Image and A2Image are two sets satisfying A1A2V{ξ}Image. Define the matrix Wc(γ)Image and the function g(γ)Image respectively as

W c ( γ ) = W c ( k , A 1 ) + γ [ W c ( k , A 2 ) W c ( k , A 1 ) ] , g ( γ ) = log det [ W c ( γ ) + W c ( k , { ξ } ) ] log det ( W c ( γ ) ) ,

Image

where γ[0,1]Image.

By Eq. (11.95) we have that

W c ( k , A 2 ) W c ( k , A 1 ) 0 .

Image (11.100)

On the other hand, noting that by its definition the matrix Wc(k,{ξ})Image is at least positive semidefinite, it is obvious that

W c ( γ ) + W c ( k , { ξ } ) W c ( γ ) .

Image (11.101)

We can therefore declare from Lemma 2.1 that

d g ( γ ) d γ = d d γ log det ( W c ( γ ) + W c ( k , { ξ } ) ) d d γ log det ( W c ( γ ) ) = tr { [ W c ( γ ) + W c ( k , { ξ } ) ] 1 [ W c ( k , A 2 ) W c ( k , A 1 ) ] } tr { W c 1 ( γ ) [ W c ( k , A 2 ) W c ( k , A 1 ) ] } = tr { [ ( W c ( γ ) + W c ( k , { ξ } ) ) 1 W c 1 ( γ ) ] [ W c ( k , A 2 ) W c ( k , A 1 ) ] } = tr { [ W c ( k , A 2 ) W c ( k , A 1 ) ] 1 2 [ ( W c ( γ ) + W c ( k , { ξ } ) ) 1 W c 1 ( γ ) ] × [ W c ( k , A 2 ) W c ( k , A 1 ) ] 1 2 } 0 ,

Image (11.102)

which implies that g(0)g(1)Image.2

On the other hand, from the definition of the function g(γ)Image it is obvious that

g ( 0 ) = log det ( W c ( k , A 1 ) + W c ( k , { ξ } ) ) log det ( W c ( k , A 1 ) ) ,

Image (11.103)

g ( 1 ) = log det ( W c ( k , A 2 ) + W c ( k , { ξ } ) ) log det ( W c ( k , A 2 ) ) .

Image (11.104)

Note that from Eq. (11.94) we immediately have that

W c ( k , A { ξ } ) = W c ( k , A ) + W c ( k , { ξ } )

Image

for all AV{ξ}Image and ξVImage. Hence we can claim that g(0)Image and g(1)Image are in fact equal to the values of the following set function fξ(A)Image respectively at A=A1Image and A=A2Image:

f ξ ( A ) = log det [ W c ( k , A { ξ } ) ] log det ( W c ( k , A ) ) , A V { ξ } .

Image

The proof can now be completed by an application of Lemma 2.8. □

In addition to the set function logdet(Wc(k,A))Image, it has also been proven in [16,17] that some other set functions, such as rank(Wc(k,A))Image, tr(Wc1(k,A))Image, etc., are also monotone increasing and submodular. This means that maximization of these set functions is in general NP-hard, but the greedy heuristic algorithm described in Chapter 2 usually works well approximately. Once again, as in the sensor placement problem, computation issues may arise for a large-scale system in the application of that greedy heuristic algorithm, noting that the associated matrix in this case still has a great dimension, which is not very convenient for numerical computations.

On the other hand, from the duality between controllability and observability of a system, similiar results can be established for sensor placements using the associated observability Gramian.

In the above investigations on sensor/actuator placements, the output/input matrix of the system is restricted by Assumption 11.1/11.2. The associated results can be extended to a more general situation in which each row of the output matrix can take values from a set of some candidate rows, or each column of the input matrix can take values from a set of some candidate columns. The major required modification in these extensions is a redefinition of the matrix I(i)Image adopted in the aforementioned investigations. Details can be found in the related literature, such as [16,17] and the references therein.

11.7 Concluding Remarks

Importance of safety is now widely recognized for a networked system, especially with the ambitious introduction of public communication channels into a control system. In this chapter, we investigated some basic issues in system designs closely related to attack detection and identification, which reveals some relations among attack detectability/identifiability, system controllability/observability, and actuator/sensor placement. To develop an algorithm applicable to actual engineering problems, some other factors, such as robustness against modeling errors, stochastic properties of external disturbances, and measurement errors, must also be taken into account.

11.8 Bibliographic Notes

Network safety has been extensively investigated in power systems. [6] summarizes various important aspects of this problem and some interesting recent advancements. Recent interest in this problem are mainly stimulated by the prospective introduction of public communication channels into control systems, which brings some new characteristics into this field. Many essential issues are still in their primary stages toward a complete settlement. [2,7] are some recent surveys about this problem.

Basic concepts and results for combinatorial optimizations can be found in [26,30]. Concerning with convex optimization, [25,31] provides an excellent introduction on its essential motivations and results.

The notion of a system matrix is due to Rosenbrock, who also introduced zeros of linear multivariable systems in terms of the Smith–McMillan form of the system transfer matrix in [10,13]. The concept of weakly unobservable subspace appears to be firstly introduced for a discrete-time system by Silverman [32]. A summary and some detailed discussions can be found in [11]. Connections between the spectral properties of a weakly unobservable subspace and the system zeros are also dealt with in [33], and [34] has studied connections between the system zeros and the state space geometric structure of the system.

Appendix 11.A

11.A.1 Proof of Theorem 11.7

From the definition of the matrix O(k,S)Image and Eq. (11.85) it is obvious that

O T ( k , S ) O ( k , S ) = [ C T A ( 0 ) T C T ( A ( 1 ) A ( 0 ) ) T C T ( j = k 0 A ( j ) ) T C T ] × [ C T A ( 0 ) T C T ( A ( 1 ) A ( 0 ) ) T C T ( j = k 0 A ( j ) ) T C T ] T = l = 0 k { ( j = l 0 A ( j ) ) T C T C ( j = l 0 A ( j ) ) } = l = 0 k { ( j = l 0 A ( j ) ) T ( i S I ( i ) ) ( j = l 0 A ( j ) ) } = i S l = 0 k { ( j = l 0 A ( j ) ) T I ( i ) ( j = l 0 A ( j ) ) } = i S O ¯ ( k , i ) ,

Image (11.A.1)

where

O ¯ ( k , i ) = l = 0 k { ( j = l 0 A ( j ) ) T I ( i ) ( j = l 0 A ( j ) ) } , i = 1 , 2 , , n .

Image

It is clear that, for each i=1,2,,nImage, the matrix O¯(k,i)Image is at least positive semidefinite. Hence, for arbitrary sensor sets S1Image and S2Image satisfying S1S2{1,2,,n}Image, the following relations are valid:

O T ( k , S 2 ) O ( k , S 2 ) = m S 2 O ¯ ( k , m ) = m S 1 O ¯ ( k , m ) + m S 2 S 1 O ¯ ( k , m ) m S 1 O ¯ ( k , m ) = O T ( k , S 1 ) O ( k , S 1 ) .

Image (11.A.2)

Let λi()Image denote the ith eigenvalue of an n×nImage-dimensional symmetric matrix that satisfies λ1()λ2()λn()Image. By Lemma 2.1 and the last equation we have that, for each i=1,2,,nImage,

λ i { O T ( k , S 2 ) O ( k , S 2 ) } λ i { O T ( k , S 1 ) O ( k , S 1 ) } .

Image (11.A.3)

Note that, for a positive semidefinite matrix, its rank is equal to the number of its nonzero eigenvalues. Moreover, each of its eigenvalues is not smaller than 0. We can therefore declare from Eq. (11.A.3) that

rank { O T ( k , S 2 ) O ( k , S 2 ) } rank { O T ( k , S 1 ) O ( k , S 1 ) } ,

Image (11.A.4)

that is, the set function f(k,S)Image of Eq. (11.90) is increasing.

To prove its submodularity, define the derived set function fi(k,S)Image for each i{1,2,,n}Image as

f i ( k , S ) = rank ( O T ( k , S { i } ) O ( k , S { i } ) ) rank ( O T ( k , S ) O ( k , S ) ) , S { 1 , 2 , , n } { i } .

Image (11.A.5)

Note that, for arbitrary n×nImage-dimensional real matrices X and Y,

rank ( X + Y ) = rank ( X ) + rank ( Y ) dim ( S p a n ( X ) S p a n ( Y ) ) ,

Image

where dim()Image stands for the dimension of a subspace [27,28]. From this property, the definition of the set function fi(k,S)Image, and from Eq. (11.A.1) we have that

f i ( k , S ) = rank ( j S { i } O ¯ ( k , j ) ) rank ( j S O ¯ ( k , j ) ) = rank ( O ¯ ( k , i ) + j S O ¯ ( k , j ) ) rank ( j S O ¯ ( k , j ) ) = rank ( O ¯ ( k , i ) ) dim ( S p a n ( O ¯ ( k , i ) ) S p a n ( j S O ¯ ( k , j ) ) ) .

Image (11.A.6)

Assume now that S1S2{1,2,,n}{i}Image. Then from the relation

m S 2 O ¯ ( k , m ) = m S 1 O ¯ ( k , m ) + m S 2 S 1 O ¯ ( k , m )

Image

and the definition of the matrix O¯(k,m)Image, m=1,2,,nImage, the following relation can be established by some algebraic manipulations:

S p a n ( j S 2 O ¯ ( k , j ) ) S p a n ( j S 1 O ¯ ( k , j ) )

Image (11.A.7)

Therefore

{ S p a n ( O ¯ ( k , i ) ) S p a n ( j S 2 O ¯ ( k , j ) ) } { S p a n ( O ¯ ( k , i ) ) S p a n ( j S 1 O ¯ ( k , j ) ) } ,

Image (11.A.8)

which further leads the following inequality:

dim { S p a n ( O ¯ ( k , i ) ) S p a n ( j S 2 O ¯ ( k , j ) ) } dim { S p a n ( O ¯ ( k , i ) ) S p a n ( j S 1 O ¯ ( k , j ) ) } .

Image (11.A.9)

Substituting this inequality into Eq. (11.A.6), we have that, under the condition that i{1,2,,n}Image and S1S2{1,2,,n}{i}Image, the following inequality is true:

f i ( k , S 2 ) f i ( k , S 1 ) ,

Image (11.A.10)

that is, this derived set function is decreasing.

We can therefore declare from Lemma 2.8 that the set function f(k,S)Image is submodular. This completes the proof. □

References

[1] G. Richards, Hackers vs slackers, Engineering Technology 2008;3.

[2] J.P. Farwell, R. Rohozinski, Stuxnet and the future of cyber war, Survival 2011;53.

[3] J. Slay, M. Miller, Lessons learned from the Maroochy water breach, Critical Infrastructure Protection 2007;253:73–82.

[4] S. Kuvshinkova, SQL slammer worm: lessons learned for consideration by the electricity sector, North American Electric Reliability Council 2003.

[5] J.P. Conti, The day the samba stopped, Engineering Technology 2010;5:46–47.

[6] U. Hager, C. Rehtanz, N. Voropai, eds. Monitoring, Control and Protection of Interconnected Power Systems. Berlin, Heidelberg: Springer-Verlag; 2014.

[7] H. Sandberg, S. Amin, K.H. Johansson, eds. Special Issue on Cyberphysical Security. IEEE Control Systems Magazine 2015;36:20–127.

[8] F. Pasqualetti, F. Dörfler, F. Bullo, Attack detection and identification in cyber-physical systems, IEEE Transactions on Automatic Control 2013;58:2715–2729.

[9] H. Fawzi, P. Tabuada, S. Diggavi, Secure estimation and control for cyber-physical systems under adversarial attacks, IEEE Transactions on Automatic Control 2014;59:1454–1467.

[10] K.M. Zhou, J.C. Doyle, K. Glover, Robust and Optimal Control. Upper Saddle River, New Jersey: Prentice Hall; 1996.

[11] H.L. Trentelman, A.A. Stoorvogel, M. Hautus, Control Theory for Linear Systems. London, UK: Springer; 2002.

[12] T. Geerts, Invariant subspaces and invertibility properties for singular systems: the general case, Linear Algebra and Its Applications 1993;183:61–88.

[13] H.H. Rosenbrock, State-Space and Multivariable Theory. New York, USA: John-Wiley; 1970.

[14] J.M. Dion, C. Commault, J. der Woude, Generic properties and control of linear structured systems: a survey, Automatica 2003;39:1125–1144.

[15] V. Tzoumas, A. Jadbabaie, G.J. Pappas, Sensor placement for optimal Kalman filtering: fundamental limits, submodularity, and algorithms, arXiv:1509.08146v3 [math.OC]; 2016.

[16] T.H. Summers, F.L. Cortesi, J. Lygeros, On submodularity and controllability in complex dynamical networks, IEEE Transactions on Control of Network Systems 2016;3:91–101.

[17] V. Tzoumas, M.A. Rahimian, G.J. Pappas, A. Jadbabaie, Minimal actuator placement with bounds on control effort, arXiv:1409.3289v5 [math.OC]; 2016.

[18] A.E. Bryson, Y.C. Ho, Applied Optimal Control: Optimization, Estimation and Control. New York, USA: Taylor & Francis; 1975.

[19] T. Kailath, A.H. Sayed, B. Hassibi, Linear Estimation. Upper Saddle River, New Jersey: Prentice Hall; 2000.

[20] L.T. Ye, S. Roy, S. Sundaram, On the complexity and approximability of optimal sensor selection for Kalman filtering, arXiv:1711.01920v1 [math.OC]; 2017.

[21] H.T. Zhang, R. Ayoub, S. Sundaram, Sensor selection for Kalman filtering of linear dynamic systems: complexity, limitations and greedy algorithms, Automatica 2017;78:202–210.

[22] D. Simon, Optimal State Prediction: Kalman, HImage and Nonlinear Approaches. Hoboken, New Jersey, USA: Wiley-Interscience, A John Wiley & Sons, Inc., Publication; 2006.

[23] Y.S. Chow, H. Teicher, Probability Theory: Independence, Interchangeability, Martingales. 3rd edition New York, USA: Springer-Verlag; 1997.

[24] G.R. Grimmett, D.R. Stirzaker, Probability and Random Processes. USA: Oxford University Press; 2001.

[25] D.P. Bertsekas, Convex Optimization Theory. Boston, USA: Athena Scientific; 2009.

[26] L.A. Wolsey, Integer Programming. USA: John Wiley & Sons; 1988.

[27] F.Z. Zhang, Matrix Theory: Basic Results and Techniques. New York: Springer; 1999.

[28] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis. Cambridge, UK: Cambridge University Press; 1991.

[29] P. Lancaster, M.T. Tismenetsky, The Theory of Matrices: With Applications. New York: Academic; 1985.

[30] L. Lovasz, Submodular functions and convexity, A. Bachem, M. Grotschel, B. Korte, eds. Mathematical Programming: The State of the Art. Bonn, Germany: Springer; 1983.

[31] J.B. Hiriart-Urruty, C. Lemarechal, Fundamentals of Convex Analysis. Berlin, Germany: Springer; 2001.

[32] L. Silverman, Discrete Riccati equations: alternative algorithms, asymptotic properties, and system theory interpretations, Control and Dynamical Systems 1976;12:313–386.

[33] W.M. Wonham, Linear Multivariable Control: A Geometric Approach. 3rd edition New York, USA: Springer Verlag; 1985.

[34] A.S. Morse, Structural invariants of linear multivariable systems, SIAM Journal on Control and Optimization 1973;11:446–465.


1  “When Ckx¯(k)Image belongs to the space spanned by the column vectors of the matrix DkImage, there may exist infinitely many vectors d¯(k)Image that satisfy Eq. (11.8). A complete parameterization for all the solutions to this equation is available, which is given by Theorem 2.5. As this is not a very essential issue here, we omit the details and only use DkCkx¯(k)Image in the discussions.”

2  “In the above derivations, we applied the formula ddγlogdet(X(γ))=tr{X1(γ)dX(γ)dγ}Image. To guarantee the validness of this formula, it is necessary that the matrix X(γ)Image is invertible [27,29]. When this condition is not satisfied, replace the matrix Wc(k,A)Image with the matrix Wc(k,A)+εIImage in the definition of the set function f(A)Image, where ε is an arbitrary positive number. Then, the matrix Wc(k,A)+εIImage is positive definite and therefore invertible. The conclusions of Theorem 11.8 can be obtained by letting ε approach zero.”

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.15.149