Cognitive Control for CBTC Systems 223
$$J = \sum_{k \to \infty} \left[ (\hat{x}_k - x_k)^T Q (\hat{x}_k - x_k) + u_k^T R u_k \right] \tag{10.9}$$
where:
$\hat{x}_k$ is the desired state of the train
$x_k$ is the actual state of the train
$Q$ is positive semidefinite
$R$ is positive definite
In fact, the rst term is the state tracking error and the second term is the control
magnitude, which is related to energy consumption.
When the cognitive control is applied, the resulting cost function can be
obtained according to Equation 10.2:
$$J = \sum_{k \to \infty} \left[ (\hat{x}_k - x_k)^T Q (\hat{x}_k - x_k) + u_k^T R u_k + \beta G_k \right] \tag{10.10}$$
where:
$G_k$ is the information gap in the $k$th communication cycle
$\beta$ is a scalar
The objective of the optimal control is to realize efficient train operations and the
minimum energy consumption through decreasing the information gap.
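As a rough illustration of Equations 10.9 and 10.10, the sketch below evaluates the cost over a finite number of communication cycles instead of the infinite horizon. The function name `lq_cost`, the array shapes, and the finite truncation are assumptions made for this example and are not part of the chapter.

```python
import numpy as np

def lq_cost(x_desired, x_actual, u, Q, R, gaps=None, beta=0.0):
    """Finite-horizon sketch of the linear quadratic cost.

    x_desired, x_actual: arrays of shape (K, n) with the desired and actual states.
    u: array of shape (K, m) with the control inputs.
    gaps: optional length-K sequence of information gaps G_k (Equation 10.10).
    With gaps=None the result corresponds to Equation 10.9.
    """
    total = 0.0
    for k in range(len(u)):
        e = x_desired[k] - x_actual[k]            # state tracking error
        total += e @ Q @ e + u[k] @ R @ u[k]      # tracking term + control term
        if gaps is not None:
            total += beta * gaps[k]               # information-gap penalty
    return float(total)

# Two cycles, two state components, one control input (illustrative numbers).
x_des = np.zeros((2, 2))
x_act = np.array([[1.0, 0.0], [0.0, 2.0]])
u = np.array([[1.0], [2.0]])
Q = np.eye(2)
R = np.array([[2.0]])
cost_without_gap = lq_cost(x_des, x_act, u, Q, R)
cost_with_gap = lq_cost(x_des, x_act, u, Q, R, gaps=[1.0, 3.0], beta=0.5)
```

With these numbers the information-gap term adds `0.5 * (1 + 3) = 2` to the cost, which shows how a larger gap is penalized alongside tracking error and control effort.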
10.4.2 Channel Model in MIMO-Enabled WLANs
In order to optimize the performance of CBTC systems, we build a finite-state
Markov channel (FSMC) model based on real field measurements. FSMC models
have been widely accepted in the literature as an effective approach to characterize
wireless channels, including high-speed railway channels [21], satellite channels [22],
and Rayleigh fading channels [23]. In FSMC models, the SNR range of the received
signal can be partitioned into nonoverlapping levels. Then the received SNR can be
modeled as a random variable evolving according to an FSMC with state transition
probabilities, which can be obtained from real field channel measurements.
Due to the eect of large-scale fading, the amplitude of SNR depends on the
distance between the transmitter and the receiver. It is obvious that the SNR is
usually high when the receiver is close to the transmitter; whereas it is low when
the receiver is far away from the transmitter. As a result, the transition probability
from the high channel state to the low channel state is dierent when the receiver
is near or far away from the transmitter, which means that the Markov state transi-
tion probability is related to the location of the receiver. erefore, only one state
transition probability matrix, which is independent of the location of the receiver,
may not accurately model the channels.
224 Advances in Communications-Based Train Control Systems
As a result, we divide the communication coverage of one access point (AP)
into $L$ intervals. For each interval, we use the Lloyd–Max method to partition the
SNR amplitude into several levels, which are nonuniformly distributed. The
nonuniform partitioning can be useful to obtain more accurate estimates of system
performance measures [24]. Specifically, $P^l$, $l \in \{1, 2, \ldots, L\}$, is the state
transition probability matrix corresponding to the $l$th interval, so that the
relationship between the transition probability and the location of the receiver can
be built. Then, $p_{n,j}^l$ is the state transition probability from state $s_n^l$ to state
$s_j^l$ in the $l$th interval. As a result, the state probabilities and the state transition
probabilities can be defined as follows:
$$\begin{aligned}
p_n^l &= P_r\{\gamma_k^l = s_n^l\} \\
p_{n,j}^l &= P_r\{\gamma_{k+1}^l = s_j^l \mid \gamma_k^l = s_n^l\} \\
p_{n,j}^l &= 0, \quad \text{if } |n - j| > 1 \\
\sum_{j=1}^{N} p_{n,j}^l &= 1, \quad \forall n \in \{1, 2, 3, \ldots, N\}
\end{aligned} \tag{10.11}$$
where:
$p_n^l$ is the probability of being in state $n$ in the $l$th interval
$\gamma_k^l$ is the channel state in time slot $k$ in the $l$th interval
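The definitions in Equation 10.11 can be illustrated with a small sketch that estimates the state and transition probabilities of one interval from a sequence of already quantized SNR levels. The function names, the 0-based state indices, and the sample sequence are assumptions for this example; the chapter's own estimation procedure from field measurements is not shown here.

```python
import numpy as np

def estimate_fsmc(levels, n_states):
    """Estimate state and transition probabilities for one interval.

    levels: sequence of SNR level indices (0-based), one per time slot.
    Returns (state_prob, trans), where trans[n, j] estimates p_{n,j}.
    """
    levels = np.asarray(levels)
    counts = np.zeros((n_states, n_states))
    for n, j in zip(levels[:-1], levels[1:]):
        counts[n, j] += 1                          # observed transition n -> j
    state_prob = np.bincount(levels, minlength=n_states) / len(levels)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that are never left in the data keep an all-zero row.
    trans = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return state_prob, trans

def is_adjacent_only(trans):
    """Check the constraint p_{n,j} = 0 whenever |n - j| > 1."""
    idx = np.arange(trans.shape[0])
    far = np.abs(idx[:, None] - idx[None, :]) > 1
    return bool(np.all(trans[far] == 0))

# Illustrative level sequence for a single interval with three SNR levels.
state_prob, trans = estimate_fsmc([0, 1, 1, 2, 1, 0], n_states=3)
```

In a location-dependent model, one such matrix would be estimated separately for each of the $L$ intervals, for example by keeping them in a list indexed by $l$.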
In this chapter, a MIMO-enabled WLAN is used in CBTC systems. A MIMO
system can provide two types of gains: diversity gain and spatial multiplexing
gain, which may not be obtained simultaneously. There is a fundamental trade-off
between diversity gain and spatial multiplexing gain [25]: higher spatial
multiplexing gain comes at the price of sacrificing diversity. We can achieve the
optimal diversity gain $d^*(r)$ with long enough block length as follows:
$$d^*(r) = (m_t - r)(n_r - r) \tag{10.12}$$
where:
$r$ is the spatial multiplexing gain, which is an integer
$m_t$ and $n_r$ are the numbers of transmitting and receiving antennas, respectively,
which correspond to the numbers of the AP antennas and the MS antennas
In particular, $d^*_{\max} = m_t n_r$ and $r^*_{\max} = \min\{m_t, n_r\}$.
With a multiplexing gain $r$, the data rate $C(r)$ and the bit error rate (BER)
$b(r)$ can be approximated as [26]
$$C(r) = k_c r \log_2(\gamma), \qquad b(r) = k_p \gamma^{-d(r)} \tag{10.13}$$
where:
$k_c$ and $k_p$ are positive constants for different coding schemes
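The trade-off in Equations 10.12 and 10.13 can be made concrete with a short sketch. It assumes that the SNR is given on a linear scale, that the diversity gain in the BER expression is the optimal gain $d^*(r)$, and that the BER decays with the SNR raised to the negative diversity gain. The function names and the default constants `k_c = k_p = 1` are placeholders, not values from the chapter.

```python
import math

def optimal_diversity_gain(r, m_t, n_r):
    """Optimal diversity gain d*(r) = (m_t - r)(n_r - r) for integer r."""
    if not 0 <= r <= min(m_t, n_r):
        raise ValueError("r must lie between 0 and min(m_t, n_r)")
    return (m_t - r) * (n_r - r)

def data_rate(r, snr, k_c=1.0):
    """Approximate data rate C(r) = k_c * r * log2(snr), snr on a linear scale."""
    return k_c * r * math.log2(snr)

def bit_error_rate(r, snr, m_t, n_r, k_p=1.0):
    """Approximate BER b(r) = k_p * snr^(-d*(r)) (assumed form, see lead-in)."""
    return k_p * snr ** (-optimal_diversity_gain(r, m_t, n_r))

# For a 2x2 link, raising r from 0 to 2 trades diversity for multiplexing.
gains = [optimal_diversity_gain(r, 2, 2) for r in range(3)]
```

For the 2x2 case the diversity gain falls from 4 to 1 to 0 as the multiplexing gain rises from 0 to 2, so a higher data rate is paid for with a weaker decay of the BER.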
As a result, the data rate and the BER can be obtained when the multiplexing gain
and the diversity gain are obtained through Equation 10.12, which can be used to
derive the performance of the higher layers.
10.4.3 Q-Learning in the Cognitive Control Approach
The overall system can be considered as a discrete-time event system. As the
wireless channel is modeled as an FSMC channel, each channel state can only
transition to the adjacent channel states. In order to utilize the Q-learning
algorithm, the system states, actions, and rewards should be identified. The RL
model in the cognitive control approach is shown in Figure 10.5.
10.4.3.1 System States and Actions
In the Q-learning model, the cognitive controller on the train should decide at
each communication cycle whether a handoff procedure happens from the connected
AP to the other available AP, where we assume that the place occupied by a train
can only be covered by two successive APs. As mentioned earlier, the multiplexing
gain needs to be determined to improve the channel conditions. The handoff action
is denoted as $a_k^h$ at the $k$th communication cycle and the multiplexing gain
action is $a_k^m$. Then, the current action $a_k \in A$ is $a_k = \{a_k^h, a_k^m\}$.

Figure 10.5 RL model in the cognitive control approach. (Diagram labels: system
(the front train); cognitive controller (decision maker); environment (wireless
channel); agent (the back train); dynamic system; ATO; mobile station; action;
reward; system state; channel state; agent state.)

Corresponding to the actions, the current states should indicate the physical
layer of wireless communications and the handoff procedure, such as the SNR levels
of two adjacent APs. Because our FSMC model is related to the distance between
the transmitter and the receiver, the channel state is given as $\gamma_{1,k}^l$ and
$\gamma_{2,k}^l$, which are the SNR levels of two successive APs in the $l$th interval.
As a result, the current state $s_k \in S$ is denoted as
$s_k = \{\gamma_{1,k}^l, \gamma_{2,k}^l, ID\}$, where $ID$ is the identification number
of the currently associated AP. When $ID$ changes, the handoff procedure happens.
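One way to picture these state and action sets is the following tabular sketch. The number of SNR levels, the candidate multiplexing gains, and the AP identifiers are hypothetical. The update in `q_update` is the standard tabular Q-learning rule, used here as an assumption because the update rule applied in the chapter is not shown in this section.

```python
import itertools
import random
from collections import defaultdict

N_LEVELS = 4           # hypothetical number of SNR levels per AP
R_CHOICES = (1, 2)     # hypothetical multiplexing gains for a 2x2 link
AP_IDS = (1, 2)        # the two successive APs covering the train

# State: (SNR level of AP 1, SNR level of AP 2, ID of the associated AP).
STATES = list(itertools.product(range(N_LEVELS), range(N_LEVELS), AP_IDS))
# Action: (handoff decision, multiplexing gain).
ACTIONS = list(itertools.product((False, True), R_CHOICES))

def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the joint handoff/multiplexing action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, lr=0.1, discount=0.9):
    """Standard tabular Q-learning update (an assumption, see lead-in)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += lr * (reward + discount * best_next - q[(state, action)])

q_table = defaultdict(float)
q_update(q_table, state=(0, 0, 1), action=(False, 1), reward=1.0, next_state=(1, 0, 1))
greedy = choose_action(q_table, (0, 0, 1), epsilon=0.0)
```

Keeping the handoff decision and the multiplexing gain in one tuple mirrors the joint action $a_k = \{a_k^h, a_k^m\}$, so a single table entry scores both decisions together.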
10.4.3.2 Reward Function
When each action is taken, the system earns a deterministic reward, which is used
to demonstrate the effects of the action on the system. In our Q-learning model, the
linear quadratic cost should be minimized according to the optimization objective
shown in Equation 10.10, which includes the guidance trajectory tracking error, the
control magnitude, and the information gap. As a result, we take the reciprocal of
the linear quadratic cost as the reward function. The communication delay affects
the tracking performance: it can cause frequent acceleration and deceleration and
increase the energy consumption. There are two kinds of communication delay:
delay with handoffs (handoff latency) and delay without handoffs. We first study
the communication delay without handoffs.
IEEE 802.11g WLANs are applied as the main method of train–ground
communication in CBTC systems, where carrier sense multiple access/collision
avoidance (CSMA/CA) is used in the media access control (MAC) layer. When the
train is running at high speed, the wireless channel can be affected by the
Doppler frequency shift, reflections, and other factors. The performance of the
physical layer then degrades, for example through a higher packet loss rate (PLR).
Packet loss triggers retransmission of data packets according to the automatic
repeat request scheme with CSMA/CA, which can cause time delay. According to
Equation 10.13, the BER can be obtained. Then, the corresponding frame error rate
(FER) is derived.
$$f(r) = 1 - [1 - b(r)]^{L_f} \tag{10.14}$$
where:
$L_f$ is the length of one frame in bits
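As a small numerical check of the FER expression, the sketch below computes the probability that at least one of the $L_f$ bits in a frame is in error, assuming independent bit errors. The function name and the example frame length and BER are illustrative choices, not values from the chapter.

```python
def frame_error_rate(ber, frame_bits):
    """FER = 1 - (1 - BER)^L_f, assuming independent bit errors."""
    return 1.0 - (1.0 - ber) ** frame_bits

# Illustrative case: a 1500-byte frame (12000 bits) at a BER of 1e-5.
example_fer = frame_error_rate(1e-5, 12000)
```

For these example values the FER is about 0.11, which shows how even a small BER turns into a noticeable frame error rate for long frames.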
For a given FER, the PLR can be obtained.
pr fr fr() () [1
()
]
1
=−
α
(10.15)
where:
$\alpha$ is the number of packet retransmissions
Generally, there are two factors that can cause packet loss: (1) packet collisions
and (2) channel errors. In CBTC scenarios, few trains are expected to occupy the
coverage of one AP at the same time, so packet collisions are unlikely. The only
factor we are concerned with in this chapter is therefore the channel error, which
can lead to packet retransmissions. First of all, we can derive that the time delay
without retransmission ($\alpha = 0$) is caused by the random access scheme according
to the CSMA/CA method in 802.11g [27].
$$T_r^0 = \mathrm{aDifsTime} + \mathrm{aDataTime} + \mathrm{aSifsTime} + \mathrm{anACKTime} + \mathrm{aPropTime} \tag{10.16}$$
where:
aDifsTime is the period of distributed interframe space (DIFS)
aDataTime is the time for the transmitter to send a data frame, determined by the
ratio of the frame length $L_f$ to the current data rate $C(r)$
aSifsTime is the period of short interframe space (SIFS)
anACKTime is the time for the transmitter to send an acknowledgment frame
aPropTime is the propagation time
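The delay without retransmission can be sketched as a direct sum of the five terms, with the data time computed as frame length over data rate. The function name and all timing values in the example call are placeholders chosen for illustration; they are not taken from the chapter, and PHY preamble and header overheads are ignored.

```python
def delay_without_retransmission(frame_bits, rate_bps, difs_s, sifs_s, ack_s, prop_s):
    """T_r^0 = aDifsTime + aDataTime + aSifsTime + anACKTime + aPropTime."""
    data_s = frame_bits / rate_bps      # aDataTime = L_f / C(r)
    return difs_s + data_s + sifs_s + ack_s + prop_s

# Placeholder values: 12000-bit frame at 6 Mbit/s, times in seconds.
t0 = delay_without_retransmission(
    frame_bits=12000, rate_bps=6e6,
    difs_s=28e-6, sifs_s=10e-6, ack_s=44e-6, prop_s=1e-6,
)
```

With these placeholder numbers the data time of 2 ms dominates the total, which suggests why the data rate $C(r)$, and hence the multiplexing gain, matters for the delay.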
When packet loss happens, the retransmission scheme is triggered with a backoff
time, which is uniformly distributed in the range $[0, CW - 1]$, where $CW$ is the
contention window. For the first backoff, $CW$ is initialized as $CW_{\min}$. Then,
each further transmission attempt doubles $CW$ until it reaches $CW_{\max}$. In the
802.11 standard [27], $CW \in \{16, 32, 64, 128, 256, 512, 1024\}$, with
$CW_{\min} = 16$ and $CW_{\max} = 1024$. Then the general expression of $CW$ can be
denoted as
$$CW_\alpha = \begin{cases} 2^{\alpha + 4}, & 0 \le \alpha \le 6 \\ 2^{10}, & \alpha > 6 \end{cases} \tag{10.17}$$
where $\alpha$ is the number of retransmissions.
As a result, the backo time can be denedas
BackoffTimeRando
ma
SlotTime
αα
=−×
([0,1])CW (10.18)
where:
aSlotTime
is a constant time corresponding to IEEE standard of 802.11g [27]
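The contention window rule and the random backoff can be sketched as follows. The function names are hypothetical, and the 9 microsecond slot time used in the example is an assumed value for illustration rather than one stated in the chapter.

```python
import random

CW_MIN, CW_MAX = 16, 1024

def contention_window(alpha):
    """CW after alpha retransmissions: 2^(alpha + 4), capped at 2^10."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return min(CW_MIN * 2 ** alpha, CW_MAX)

def backoff_time(alpha, slot_time_s, rng=random):
    """Draw an integer number of slots uniformly from [0, CW_alpha - 1]."""
    slots = rng.randint(0, contention_window(alpha) - 1)   # inclusive bounds
    return slots * slot_time_s

windows = [contention_window(a) for a in range(8)]
sample = backoff_time(alpha=2, slot_time_s=9e-6, rng=random.Random(0))
```

Doubling the window spreads retransmissions over a longer interval, so the expected backoff delay grows with each failed attempt until the cap at 1024 is reached.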
Hence, we can derive the time delay caused by retransmissions.
$$T_r^\alpha = \sum_{i=0}^{\alpha} (\mathrm{BackoffTime}_i + \mathrm{aPropTime}) + \alpha\,(\mathrm{aDifsTime} + \mathrm{aDataTime} + \mathrm{aSifsTime}) \tag{10.19}$$