10. Cognitive Control for Communications-Based Train Control Systems (3/7)


Cognitive Control for CBTC Systems ◾ 223

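$$J = \lim_{N \to \infty} \sum_{k=0}^{N} \left[ (x_k - \hat{x}_k)^{T} Q (x_k - \hat{x}_k) + u_k^{T} R u_k \right] \qquad (10.9)$$

where:
$\hat{x}_k$ is the desired state of the train
$x_k$ is the actual state of the train
$Q$ is semipositive definite
$R$ is positive definite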

In fact, the first term is the state tracking error and the second term is the control magnitude, which is related to energy consumption.

When the cognitive control is applied, the resulting cost function can be

obtained according to Equation 10.2

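$$J = \lim_{N \to \infty} \sum_{k=0}^{N} \left[ (x_k - \hat{x}_k)^{T} Q (x_k - \hat{x}_k) + u_k^{T} R u_k + \beta G_k \right] \qquad (10.10)$$

where:
$G_k$ is the information gap in the $k$th communication cycle
$\beta$ is a scalar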

The objective of the optimal control is to realize efficient train operations and the minimum energy consumption through decreasing the information gap.
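The structure of the cost in Equation 10.10 can be illustrated with a minimal Python sketch. It assumes a finite number of communication cycles (a truncation of the infinite-horizon sum) and small dense matrices stored as nested lists. The function names `quad_form` and `cognitive_cost`, and all numerical values, are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def quad_form(v: Vector, m: Matrix) -> float:
    """Return the quadratic form v^T M v for a small dense matrix."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


def cognitive_cost(
    actual: Sequence[Vector],
    desired: Sequence[Vector],
    controls: Sequence[Vector],
    info_gaps: Sequence[float],
    q: Matrix,
    r: Matrix,
    beta: float,
) -> float:
    """Sum tracking error, control magnitude, and weighted information gap.

    Assumption: the horizon is finite, one entry per communication cycle.
    """
    total = 0.0
    for x, x_ref, u, gap in zip(actual, desired, controls, info_gaps):
        error = [a - b for a, b in zip(x, x_ref)]
        total += quad_form(error, q) + quad_form(u, r) + beta * gap
    return total


if __name__ == "__main__":
    # Hypothetical two-cycle example; the state is [position, speed].
    q_matrix = [[1.0, 0.0], [0.0, 0.5]]
    r_matrix = [[0.1]]
    cost = cognitive_cost(
        actual=[[0.0, 0.0], [1.0, 2.0]],
        desired=[[0.0, 0.0], [2.0, 2.0]],
        controls=[[1.0], [2.0]],
        info_gaps=[0.5, 0.25],
        q=q_matrix,
        r=r_matrix,
        beta=2.0,
    )
    print(cost)
```

Setting `beta` to zero recovers the cost without the information-gap term, which makes the contribution of the cognitive control term easy to isolate.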

10.4.2 Channel Model in MIMO-Enabled WLANs

In order to optimize the performance of CBTC systems, we build a finite-state Markov channel (FSMC) model based on real field measurements. FSMC models have been widely accepted in the literature as an effective approach to characterize wireless channels, including high-speed railway channels [21], satellite channels [22], and Rayleigh fading channels [23]. In FSMC models, the SNR range of the received signal can be partitioned into nonoverlapping levels. Then the received SNR can be modeled as a random variable evolving according to an FSMC with state transition probabilities, which can be obtained from real field channel measurements.

Due to the effect of large-scale fading, the amplitude of the SNR depends on the distance between the transmitter and the receiver. The SNR is usually high when the receiver is close to the transmitter, whereas it is low when the receiver is far away from the transmitter. As a result, the transition probability from the high channel state to the low channel state differs depending on whether the receiver is near or far away from the transmitter, which means that the Markov state transition probability is related to the location of the receiver. Therefore, a single state transition probability matrix, which is independent of the location of the receiver, may not accurately model the channels.

224 ◾ Advances in Communications-Based Train Control Systems

As a result, we divide the communication coverage of one access point (AP) into $L$ intervals. For each interval, we use the Lloyd–Max method to partition the SNR amplitude into several levels, which are nonuniformly distributed. The nonuniform partitioning can be useful to obtain more accurate estimates of system performance measures [24]. Specifically, $P^{l}$, $l \in \{1, 2, \ldots, L\}$, is the state transition probability matrix corresponding to the $l$th interval, and the relationship between the transition probability and the location of the receiver can be built. Then, $p_{n,j}^{l}$ is the state transition probability from state $s_n$ to state $s_j$ in the $l$th interval. As a result, the state probabilities and the state transition probabilities can be defined as follows:

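$$
\begin{aligned}
p_{n}^{l} &= P\{\gamma_{k}^{l} = s_n\} \\
p_{n,j}^{l} &= P\{\gamma_{k+1}^{l} = s_j \mid \gamma_{k}^{l} = s_n\} \\
p_{n,j}^{l} &= 0, \quad \text{if } |n - j| > 1 \\
\sum_{j=1}^{N} p_{n,j}^{l} &= 1, \quad \forall n \in \{1, 2, 3, \ldots, N\}
\end{aligned}
\qquad (10.11)
$$

where:
$p_{n}^{l}$ is the probability of being in state $s_n$ in the $l$th interval
$\gamma_{k}^{l}$ is the channel state in time slot $k$ in the $l$th interval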
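One way to obtain location-dependent transition matrices from measurements is to count the observed level-to-level transitions separately for each interval. The Python sketch below assumes that the SNR samples have already been quantized into level indices (for example, by a Lloyd–Max quantizer) and tagged with an interval index. The function name `estimate_transition_matrices` and the trace format are hypothetical, and the sketch is not the estimation procedure used by the authors.

```python
from typing import List, Sequence, Tuple

# One sample of a quantized trace: (interval index, SNR level index).
Sample = Tuple[int, int]


def estimate_transition_matrices(
    trace: Sequence[Sample], num_intervals: int, num_levels: int
) -> List[List[List[float]]]:
    """Estimate one row-stochastic transition matrix per interval.

    Transitions are counted only between consecutive samples that lie in
    the same interval. Rows with no observations are left as all zeros.
    """
    counts = [
        [[0] * num_levels for _ in range(num_levels)]
        for _ in range(num_intervals)
    ]
    for (interval_a, level_a), (interval_b, level_b) in zip(trace, trace[1:]):
        if interval_a == interval_b:
            counts[interval_a][level_a][level_b] += 1

    matrices = []
    for interval_counts in counts:
        matrix = []
        for row in interval_counts:
            total = sum(row)
            if total:
                matrix.append([c / total for c in row])
            else:
                matrix.append([0.0] * num_levels)
        matrices.append(matrix)
    return matrices


if __name__ == "__main__":
    # Hypothetical trace: two intervals, three SNR levels.
    samples = [(0, 0), (0, 1), (0, 1), (0, 0), (1, 2), (1, 1), (1, 2)]
    for index, matrix in enumerate(estimate_transition_matrices(samples, 2, 3)):
        print(index, matrix)
```

With real measurements, the adjacency constraint of the FSMC model could be checked by verifying that entries with level indices more than one apart are zero or negligible.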

In this chapter, a MIMO-enabled WLAN is used in CBTC systems. A MIMO system can provide two types of gains: diversity gain and spatial multiplexing gain, which may not be obtained simultaneously. There is a fundamental trade-off between diversity gain and spatial multiplexing gain [25]: higher spatial multiplexing gain comes at the price of sacrificing diversity. We can achieve the optimal diversity gain $d(r)$ with long enough block length as follows:

$$d(r) = (m_t - r)(m_r - r) \qquad (10.12)$$

where:
$r$ is the spatial multiplexing gain, which is an integer
$m_t$ and $m_r$ are the numbers of transmitting and receiving antennas, respectively, which correspond to the numbers of the AP antennas and the MS antennas

In particular, $d_{\max} = m_t m_r$ and $r_{\max} = \min\{m_t, m_r\}$.

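With a multiplexing gain $r$, the data rate $C(r)$ and the bit error rate (BER) probability $b(r)$ can be approximated as [26]

$$C(r) = k_c \, r \log_2(\gamma), \qquad b(r) = k_b \, \gamma^{-d(r)} \qquad (10.13)$$

where:
$\gamma$ is the received SNR
$k_c$ and $k_b$ are positive constants for different coding schemes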


As a result, the data rate and the BER can be obtained when the multiplexing gain

and the diversity gain are obtained through Equation 10.12, which can be used to

derive the performance of the higher layers.
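The trade-off can be made concrete with a short Python sketch. It assumes the diversity gain of Equation 10.12 in the form (m_t - r)(m_r - r) and the commonly used approximations C(r) = k_c r log2(SNR) and b(r) = k_b SNR^(-d(r)) for Equation 10.13, with the SNR given on a linear scale. The function names and the default constants `k_c = 1` and `k_b = 1` are hypothetical placeholders.

```python
import math


def diversity_gain(r: int, m_t: int, m_r: int) -> int:
    """Optimal diversity gain for an integer multiplexing gain r."""
    if not 0 <= r <= min(m_t, m_r):
        raise ValueError("r must lie between 0 and min(m_t, m_r)")
    return (m_t - r) * (m_r - r)


def data_rate(r: int, snr: float, k_c: float = 1.0) -> float:
    """Approximate data rate; snr is linear (not dB), k_c is assumed."""
    return k_c * r * math.log2(snr)


def bit_error_rate(
    r: int, snr: float, m_t: int, m_r: int, k_b: float = 1.0
) -> float:
    """Approximate BER; decays with the SNR at a slope set by d(r)."""
    return k_b * snr ** (-diversity_gain(r, m_t, m_r))


if __name__ == "__main__":
    # Hypothetical 2x2 link at a linear SNR of 100 (20 dB).
    for gain in range(0, 3):
        print(gain, diversity_gain(gain, 2, 2), data_rate(gain, 100.0),
              bit_error_rate(gain, 100.0, 2, 2))
```

Sweeping `r` from 0 to min(m_t, m_r) shows the rate increasing while the BER exponent shrinks, which is the trade-off the controller has to balance when choosing the multiplexing gain.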

10.4.3 Q-Learning in the Cognitive Control Approach

The overall system can be considered as a discrete-time event system. As the wireless channel is modeled as an FSMC channel, each channel state can only transit to the adjacent channel states. In order to utilize the Q-learning algorithm, the system states, actions, and rewards should be identified. The RL model in the cognitive control approach is shown in Figure 10.5.

10.4.3.1 System States and Actions

In the Q-learning model, the cognitive controller on the train should decide at each communication cycle whether a handoff procedure happens from the connected AP to the other available AP, where we assume that the place occupied by a train can only be covered by two successive APs. As mentioned earlier, the multiplexing gain needs to be determined to improve the channel conditions. The handoff action is denoted as $a_h(k)$ at the $k$th communication cycle and the multiplexing gain action is $a_r(k)$. Then, the current action is $a(k) \in A$, with $a(k) = \{a_h(k), a_r(k)\}$.

Figure 10.5 RL model in the cognitive control approach. (Diagram labels: system (the front train); agent (the back train); cognitive controller (decision maker); environment (wireless channel); dynamic system; ATO; mobile station; reward; action; system state; channel state; agent state.)

Corresponding to the actions, the current states should indicate the physical layer of wireless communications and the handoff procedure, such as the SNR levels of two adjacent APs. Because our FSMC model is related to the distance between the transmitter and the receiver, the channel state is given as $\gamma_{1}^{l}$ and $\gamma_{2}^{l}$, which are the SNR levels of two successive APs in the $l$th interval. As a result, the current state $s(k) \in S$ is denoted as $s(k) = \{\gamma_{1}^{l}, \gamma_{2}^{l}, ID\}$, where $ID$ is the identification number of the current associated AP. When $ID$ changes, the handoff procedure happens.

10.4.3.2 Reward Function

When each action is taken, the system earns a deterministic reward, which is used to demonstrate the effects of the action on the system. In our Q-learning model, the linear quadratic cost should be minimized according to the optimization objective shown in Equation 10.10, which includes the guidance trajectory tracking error, the control magnitude, and the information gap. As a result, we take the reciprocal of the linear quadratic cost as the reward function. The communication delay may affect the tracking performance, which can cause frequent acceleration and deceleration and increase the energy consumption. There are two kinds of communication delay: delay with handoffs (handoff latency) and delay without handoffs. We first study the communication delay without handoffs.

IEEE 802.11g WLANs are applied as the main method of train–ground communication in CBTC systems, where carrier sense multiple access/collision avoidance (CSMA/CA) is used in the media access control (MAC) layer. When the train is running at high speed, the wireless channel can be affected by the Doppler frequency shift, reflections, and other factors. The performance of the physical layer then degrades, for example in terms of the packet loss rate (PLR). Packet loss triggers retransmission of data packets according to the automatic repeat request scheme with CSMA/CA, which can cause time delay. According to Equation 10.13, the BER can be obtained. Then, the corresponding frame error rate (FER) is derived:

$$f(r) = 1 - [1 - b(r)]^{L_f} \qquad (10.14)$$

where:
$L_f$ is the length of one frame in bits
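As a small worked example of Equation 10.14, the Python sketch below computes the FER from a BER under the usual assumption that bit errors are independent, so that a frame is correct only if every bit is correct. The function name `frame_error_rate` is hypothetical.

```python
def frame_error_rate(ber: float, frame_bits: int) -> float:
    """Return 1 - (1 - ber)^frame_bits, assuming independent bit errors."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must be a probability")
    if frame_bits < 0:
        raise ValueError("frame_bits must be non-negative")
    return 1.0 - (1.0 - ber) ** frame_bits


if __name__ == "__main__":
    # Hypothetical values: a 1500-byte frame at several BER levels.
    for ber in (1e-6, 1e-5, 1e-4):
        print(ber, frame_error_rate(ber, 1500 * 8))
```

The example shows how quickly the FER grows with the frame length: a BER that looks negligible per bit can still produce a noticeable fraction of corrupted frames.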

For a given FER, the PLR $p(r)$ can be obtained as a function of $f(r)$, $1 - f(r)$, and $\alpha$ (Equation 10.15)

where:
$\alpha$ is the number of packet retransmissions


Generally, there are two factors that can cause packet loss: (1) packet collision and (2) channel error. In CBTC scenarios, there may not be many trains occupying one AP's coverage. As a result, packet collisions may not occur. The only factor we are concerned about in this chapter is therefore the channel error, which can lead to packet retransmissions. First of all, the time delay without retransmission ($\alpha = 0$) is caused by the random access scheme according to the CSMA/CA method in 802.11g [27]:

$$D_0 = \mathit{aDifsTime} + \mathit{aDataTime} + \mathit{aSifsTime} + \mathit{anACKTime} + \mathit{aPropTime} \qquad (10.16)$$

where:
$D_0$ is the time delay without retransmission ($\alpha = 0$)
$\mathit{aDifsTime}$ is the period of distributed interframe space (DIFS)
$\mathit{aDataTime}$ is the time for the transmitter to send a data frame, determined by the ratio of the frame length $L_f$ to the current data rate $C(r)$
$\mathit{aSifsTime}$ is the period of short interframe space (SIFS)
$\mathit{anACKTime}$ is the time to send an acknowledgment (ACK) frame
$\mathit{aPropTime}$ is the propagation time
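The delay without retransmission is a plain sum of the five timing terms listed above, with the data time given by the frame length divided by the data rate. The Python sketch below illustrates this. The function names are hypothetical, and the timing values in the example (DIFS, SIFS, ACK time, propagation time) are illustrative assumptions rather than values quoted from the chapter.

```python
def data_time(frame_bits: int, rate_bps: float) -> float:
    """Time in seconds to send a frame of frame_bits at rate_bps."""
    if rate_bps <= 0.0:
        raise ValueError("rate_bps must be positive")
    return frame_bits / rate_bps


def delay_without_retransmission(
    frame_bits: int,
    rate_bps: float,
    difs: float,
    sifs: float,
    ack_time: float,
    prop_time: float,
) -> float:
    """Sum of DIFS, data time, SIFS, ACK time, and propagation time."""
    return difs + data_time(frame_bits, rate_bps) + sifs + ack_time + prop_time


if __name__ == "__main__":
    # Illustrative values only (seconds): a 200-byte frame at 8 Mbit/s.
    delay = delay_without_retransmission(
        frame_bits=1600,
        rate_bps=8e6,
        difs=28e-6,
        sifs=10e-6,
        ack_time=30e-6,
        prop_time=1e-6,
    )
    print(delay)
```

Since the data rate depends on the multiplexing gain, a larger gain shortens the data time but, through the higher BER, makes retransmissions more likely.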

When packet loss happens, the retransmission scheme is triggered with a backoff time, which is uniformly distributed in the range $[0, CW - 1]$, where $CW$ is the contention window. For the first backoff, $CW$ is initialized as $CW_{\min}$. Then, each transmission attempt doubles $CW$ until it reaches $CW_{\max}$. In the 802.11 standard [27], $CW \in \{16, 32, 64, 128, 256, 512, 1024\}$, with $CW_{\min} = 16$ and $CW_{\max} = 1024$. Then the general expression of $CW$ can be denoted as

$$CW_{\alpha} = \begin{cases} 2^{\alpha + 4}, & 0 \le \alpha \le 6 \\ 2^{10}, & \alpha > 6 \end{cases} \qquad (10.17)$$

where $\alpha$ is the number of retransmissions.

As a result, the backoff time can be defined as

$$\mathit{BackoffTime}_{\alpha} = \mathrm{Random}([0, CW_{\alpha} - 1]) \times \mathit{aSlotTime} \qquad (10.18)$$

where:
$\mathit{aSlotTime}$ is a constant time specified by the IEEE 802.11g standard [27]
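The doubling rule for the contention window and the random backoff can be sketched in a few lines of Python. The sketch follows the values stated in the text (a window of 16 that doubles per attempt up to 1024, and a uniform draw from 0 to CW - 1). The function names and the slot time used in the example are assumptions made for illustration.

```python
import random

CW_MIN = 16
CW_MAX = 1024


def contention_window(alpha: int) -> int:
    """Contention window after alpha retransmissions, capped at CW_MAX."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return min(CW_MIN * 2 ** alpha, CW_MAX)


def backoff_time(alpha: int, slot_time: float, rng=random) -> float:
    """Draw a backoff uniformly from {0, ..., CW - 1} slots, in seconds."""
    slots = rng.randint(0, contention_window(alpha) - 1)  # both ends inclusive
    return slots * slot_time


if __name__ == "__main__":
    generator = random.Random(2024)   # seeded for repeatable illustration
    slot = 9e-6                       # assumed slot time in seconds
    for attempt in range(8):
        print(attempt, contention_window(attempt),
              backoff_time(attempt, slot, generator))
```

Passing a seeded `random.Random` instance keeps simulations repeatable, while the default uses the module-level generator.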

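Hence, we can derive the time delay caused by retransmissions:

$$D_{\alpha} = \sum_{i=1}^{\alpha} \left( \mathit{BackoffTime}_{i} + \mathit{aPropTime} + \mathit{aDifsTime} + \mathit{aDataTime} + \mathit{aSifsTime} \right) \qquad (10.19)$$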
