226 ◾ Advances in Communications-Based Train Control Systems
needs to be determined to improve the channel conditions. The handoff action is
denoted as $h_k$ at the $k$th communication cycle and the multiplexing gain action is
$m_k$. Then, the current action is $a_k = \{h_k, m_k\}$.
Corresponding to the actions, the current states should indicate the physical
layer of wireless communications and the handoff procedure, such as the SNR levels
of two adjacent APs. Because our FSMC model is related to the distance between
the transmitter and the receiver, the channel state is given as
$\gamma_l^1$ and $\gamma_l^2$, which are
the SNR levels of two successive APs in the $l$th interval. As a result, the current state
is denoted as $s_k = \{\gamma_{l_k}^1, \gamma_{l_k}^2, \mathrm{ID}\}$, where $\mathrm{ID}$ is the identification number of the
current associated AP. When $\mathrm{ID}$ changes, the handoff procedure happens.
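As a rough illustration, the action and state described above could be held in code along the following lines. This is only a sketch: the names `Action`, `State`, and `handoff_occurred` are hypothetical, the handoff action is assumed to be a yes/no decision, and the SNR levels are assumed to be integer indices of the FSMC channel states. None of these encodings is prescribed by the text.

```python
from typing import NamedTuple


class Action(NamedTuple):
    """Action taken in one communication cycle (hypothetical encoding)."""
    handoff: bool           # handoff action; assumed here to be a yes/no decision
    multiplexing_gain: int  # multiplexing gain action; assumed integer-valued


class State(NamedTuple):
    """State observed in one communication cycle (hypothetical encoding)."""
    snr_level_ap1: int  # SNR level of the first of two successive APs
    snr_level_ap2: int  # SNR level of the second of two successive APs
    ap_id: int          # identification number of the currently associated AP


def handoff_occurred(previous: State, current: State) -> bool:
    """A handoff has happened when the associated AP's ID changes."""
    return previous.ap_id != current.ap_id
```

Using immutable tuples keeps states and actions hashable, so they can serve directly as keys of a tabular Q-function.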
10.4.3.2 Reward Function
When each action is taken, the system earns a deterministic reward, which is used to
demonstrate the effects of the action on the system. In our Q-learning model, the
linear quadratic cost should be minimized according to the optimization objective
shown in Equation 10.10, which includes the guidance trajectory tracking error, the
control magnitude, and the information gap. As a result, we take the reciprocal of
the linear quadratic cost as the reward function. Hence, the communication delay
may affect the tracking performance, which can cause frequent acceleration
and deceleration and increase the energy consumption. There are two kinds of
communication delay: delay with handoffs (handoff latency) and delay without
handoffs. Therefore, we should first study the communication delay without handoffs.
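The reward definition used here (the reciprocal of the linear quadratic cost) can be sketched as follows. The function name `reward_from_cost` and the small floor `eps` are assumptions added for illustration; the cost itself is taken as an already-computed number, since its exact form is given by Equation 10.10 and is not reproduced in this sketch.

```python
def reward_from_cost(cost: float, eps: float = 1e-9) -> float:
    """Return the reciprocal of a non-negative cost as the reward.

    `eps` is a hypothetical floor that avoids division by zero when the
    cost is exactly zero; it is not part of the original formulation.
    """
    if cost < 0:
        raise ValueError("a linear quadratic cost is expected to be non-negative")
    return 1.0 / max(cost, eps)
```

With this mapping, minimizing the cost and maximizing the reward select the same actions, because the reciprocal is strictly decreasing for positive costs.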
IEEE 802.11g WLANs are applied as the main method of train–ground
communication in CBTC systems, where carrier sense multiple access/collision
avoidance (CSMA/CA) is used in the media access control (MAC) layer. When the
train is running at high speed, the wireless channel can be affected by the
Doppler frequency shift, reflections, and other factors. The performance of the
physical layer then degrades, for example through a higher packet loss rate (PLR). Packet loss triggers
retransmission of data packets according to the automatic repeat request scheme
with CSMA/CA, which can cause time delay. According to Equation 10.13, the
BER can be obtained. Then, the corresponding frame error rate (FER) is derived.
$P_{fr}(\gamma) = 1 - [1 - P_{br}(\gamma)]^{L_f}$ (10.14)
where:
$L_f$ is the length of one frame in bits
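The relation in Equation 10.14 says that a frame is received correctly only if every one of its bits is, so the FER is one minus the probability that all bits in the frame are error-free. A minimal Python sketch follows, assuming independent bit errors; the function and parameter names are hypothetical.

```python
def frame_error_rate(ber: float, frame_length_bits: int) -> float:
    """FER = 1 - (1 - BER)^L for a frame of L bits with independent bit errors."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must lie in [0, 1]")
    if frame_length_bits < 0:
        raise ValueError("frame_length_bits must be non-negative")
    # Probability that all bits of the frame are received without error.
    all_bits_ok = (1.0 - ber) ** frame_length_bits
    return 1.0 - all_bits_ok
```

For instance, an illustrative BER of 1e-5 with an 8000-bit frame gives an FER of roughly 0.077, which shows how quickly small bit error rates accumulate over long frames.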
For a given FER, the PLR can be obtained.
pr fr fr() () [1
]
1
−α
(10.15)
where:
$\alpha$ is the number of packet retransmissions