7.5.2.1 Reduced-State Bellman Equation
Instead of working on the global state space $s(k) = \{h_{B_1}(k), h_{B_2}(k), h_{B_3}(k), h_{B_4}(k), \xi(k), \varepsilon(k)\}$, we shall derive a reduced-state Bellman equation from Equation 7.24 using a partitioning of the control policy π that is based on the partial system state ε only. Specifically, we partition a unichain policy π into a collection of actions as follows:
Denition (conditional actions): Given a policy π, we dene
πε π
() {(
):=
s
sh
hhhhhhh
BB BB BB BB
=∀
(, ,,,,) ,,
,,
}
1234 1234
ξε ξ
as the collection of actions under a
given ε for all possible
hh
hh
BB BB12
34
,,
,,ξ
. e policy π is therefore equal to the
union of all the conditional actions, at is,
ππε
ε
=
()
.
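As a minimal illustration of this partitioning (assuming the full-state policy is stored as a lookup table keyed by the complete state tuple, which is a representational assumption rather than part of the chapter), the conditional action sets $\pi(\varepsilon)$ can be obtained by grouping the entries that share the same velocity error state:

```python
from collections import defaultdict

# Hypothetical full-state policy: maps a complete state tuple
# (epsilon, xi, h_B1, h_B2, h_B3, h_B4) to an action (CoMP cluster, handoff decision).
full_policy = {
    (0, 0, 1, 2, 0, 3): ("cluster_A", "stay"),
    (0, 1, 2, 2, 1, 0): ("cluster_A", "handoff"),
    (1, 0, 0, 1, 3, 2): ("cluster_B", "stay"),
}

def partition_policy(policy):
    """Group actions by the velocity error state epsilon (first tuple component),
    yielding the conditional action sets pi(epsilon)."""
    conditional = defaultdict(set)
    for state, action in policy.items():
        epsilon = state[0]
        conditional[epsilon].add(action)
    return dict(conditional)

pi_eps = partition_policy(full_policy)
# The union of all conditional action sets recovers the actions of the original policy.
print(pi_eps)
```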
Based on this partitioning, the Bellman equation in Equation 7.24 can be reduced to a Bellman equation defined on the velocity error state ε only:
$$\rho + V(\varepsilon) = \min_{\delta(\varepsilon)} \left\{ r\big(\varepsilon, \delta(\varepsilon)\big) + \sum_{\varepsilon'} \Pr\big[\varepsilon' \mid \varepsilon, \delta(\varepsilon)\big] \, V(\varepsilon') \right\} \qquad (7.25)$$
where:
$V(\varepsilon) = E[V(s) \mid \varepsilon]$ is the conditional potential function
$\delta(\varepsilon) = \pi(\varepsilon)$ is the collection of actions under a given $\varepsilon$
$r\big(\varepsilon, \delta(\varepsilon)\big) = E\big\{ r[s, \delta(s)] \mid \varepsilon \big\}$ is the conditional per-stage reward
$\Pr\big[\varepsilon' \mid \varepsilon, \delta(\varepsilon)\big] = E\big\{ \Pr[\varepsilon' \mid s, \delta(s)] \mid \varepsilon \big\}$ is the conditional average transition kernel
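As a rough illustration of how these conditional quantities could be obtained from full-state ones, one can average over the hidden part of the state. In the sketch below, the full-state cost array, the full-state kernel, and the conditional weights of the hidden configurations given ε are hypothetical placeholders, used only to make the averaging step concrete:

```python
import numpy as np

K = 5          # number of velocity error states
H = 8          # hypothetical number of joint "hidden" configurations (xi and the four CSI states)

rng = np.random.default_rng(4)
r_full = rng.uniform(0.0, 1.0, size=(K, H))      # full-state per-stage cost r(s, delta(s))
P_full = rng.dirichlet(np.ones(K), size=(K, H))  # full-state kernel Pr[eps' | (eps, hidden), delta]
w = rng.dirichlet(np.ones(H), size=K)            # conditional weights Pr[hidden | eps]

# Conditional per-stage cost r(eps, delta(eps)) = E[ r(s, delta(s)) | eps ]
r_cond = (w * r_full).sum(axis=1)                # shape (K,)

# Conditional average transition kernel Pr[eps' | eps, delta(eps)] = E[ Pr(eps' | s, delta(s)) | eps ]
P_cond = np.einsum("kh,khj->kj", w, P_full)      # shape (K, K)
```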
A solution to this equation is still very complex due to the huge dimensionality of the state space, and brute-force value iteration or policy iteration [11] has an exponential memory requirement. As a result, it is desirable to obtain an online, low-complexity solution to the problem.
7.5.2.2 Online Value Iteration Algorithm via Stochastic Approximation
In this section, we derive a low-complexity (but optimal) solution to the reduced-state Bellman equation in Equation 7.25 by proposing an online, sample-path-based iterative learning algorithm that estimates the performance potential and the optimal policy.
Dene a vector mapping
Π
: with the
i
th component mapping (1 ≤ i K) as
Π
i
i
ii
j
ji
ij
Vr Pr V() ,( )|,( )()
()
=
+
δε
ε
εδεεεδ
εε
min
(7.26)
where:
$K$ is the total number of velocity tracking error states
Because the potential is unique up to an additive constant, we could set $\Pi_i(\tilde{V}) - \tilde{V}(\varepsilon_i) = \chi$ for a reference state $\varepsilon_i$ ($1 \le i \le K$).
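To make the mapping concrete, the following minimal sketch evaluates $\Pi_i$ of Equation 7.26 for all error states at once; the reduced-state cost array `r`, the kernel `P`, and the number of candidate actions `A` are hypothetical placeholders rather than quantities given in the chapter:

```python
import numpy as np

K = 5          # number of velocity tracking error states (Table 7.2)
A = 3          # hypothetical number of candidate actions per error state

rng = np.random.default_rng(0)
r = rng.uniform(0.0, 1.0, size=(K, A))          # r(eps_i, delta): conditional per-stage cost
P = rng.dirichlet(np.ones(K), size=(K, A))      # P[i, a, j] = Pr[eps_j | eps_i, delta = a]

def Pi_mapping(V):
    """Apply Pi_i(V) = min_a [ r(eps_i, a) + sum_j P[i, a, j] V(eps_j) ] for every
    error state i, returning the mapped vector and the minimizing actions."""
    Q = r + P @ V                 # Q[i, a] = r(eps_i, a) + E[V | eps_i, a]
    return Q.min(axis=1), Q.argmin(axis=1)

V = np.zeros(K)
PiV, greedy_actions = Pi_mapping(V)
```

The greedy minimizers returned alongside the mapped vector are the conditional actions used later in the policy improvement step of Equation 7.30.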
Let $\{\varepsilon(0), \ldots, \varepsilon(k), \ldots\}$ be the sample path, that is, the corresponding realizations of the system states. To perform the online value iteration,
we divide the sample path into regenerative periods. A regenerative period is defined as the minimum interval within which each ε state is visited at least once. Let $l^d(i)$ and $\tilde{V}^d$ be the number of times that $\varepsilon_i$ is visited and the estimated performance potential in the $d$th regenerative period, respectively. Let $n_0 = 0$ and $n_{d+1} = \min\{k > n_d : \min_i l^d(i) \ge 1\}$ for $d \ge 0$. Then the sample path in the $d$th regenerative period is $\{\varepsilon(n_d), \ldots, \varepsilon(n_{d+1} - 1)\}$. At the beginning of the $d$th regenerative period ($n_d \le t \le n_{d+1} - 1$), initialize the dummy variables as $S_g^d(i) = 0$, $S_V^d(i) = 0$, and $l^d(i) = 0$. Within the $d$th regenerative period, we adopt policy $\pi_d$. After observing the velocity error state $\varepsilon(k+1)$ at the end of the $k$th slot, update the following metrics of the visited state $\varepsilon(k)$ according to
$$\left.\begin{aligned}
S_g^d(i) &= S_g^d(i) + r\big[\varepsilon(k), \pi_d(\varepsilon(k))\big] \\
S_V^d(i) &= S_V^d(i) + \tilde{V}^d\big(\varepsilon(k+1)\big) \\
l^d(i) &= l^d(i) + 1
\end{aligned}\right\} \quad \text{if } \varepsilon(k) = \varepsilon_i \qquad (7.27)$$
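A minimal sketch of this within-period bookkeeping is given below; the cost function, the current policy, and the random next-state draws are stand-ins for the real CBTC dynamics and are used only to make the fragment executable. The loop also illustrates the regenerative-period boundary: the period ends once every error state has been visited at least once.

```python
import numpy as np

K = 5                                    # number of velocity error states
rng = np.random.default_rng(1)

def per_stage_cost(eps, action):
    # Hypothetical per-stage cost r(eps, delta(eps)); the chapter's linear
    # quadratic cost would be plugged in here.
    return float(eps) + 0.1 * action

policy_d = rng.integers(0, 3, size=K)    # current policy pi_d (action index per error state)
V_d = np.zeros(K)                        # current potential estimate V~^d

S_g = np.zeros(K)                        # accumulated per-stage costs, S_g^d(i)
S_V = np.zeros(K)                        # accumulated next-state potentials, S_V^d(i)
visits = np.zeros(K, dtype=int)          # visit counts, l^d(i)

eps = rng.integers(0, K)                 # epsilon(n_d), start of the d-th period
while visits.min() < 1:                  # the period ends once every error state is visited
    nxt = rng.integers(0, K)             # observed epsilon(k+1) (stand-in for the real dynamics)
    S_g[eps] += per_stage_cost(eps, policy_d[eps])
    S_V[eps] += V_d[nxt]
    visits[eps] += 1
    eps = nxt
```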
At the end of the $d$th regenerative period, using the stochastic approximation algorithm [29], we update the estimated potential for the $(d+1)$th regenerative period as
$$\tilde{V}^{d+1}(\varepsilon_i) = \tilde{V}^d(\varepsilon_i) + \varsigma_d \, Y^d(\varepsilon_i) \qquad (7.28)$$
where:
$$Y^d(\varepsilon_i) = \frac{S_g^d(i)}{l^d(i)} - \frac{S_g^d(K)}{l^d(K)} + \frac{S_V^d(i)}{l^d(i)} - \frac{S_V^d(K)}{l^d(K)} - \tilde{V}^d(\varepsilon_i) \qquad (7.29)$$
$\varsigma_d$ is the step size of the stochastic approximation algorithm
$\varepsilon_K$ is the reference state
Without loss of generality, we take the velocity tracking error state with the highest value as the reference state. Accordingly, we update the policy for the $(d+1)$th regenerative period as
$$\pi_{d+1} = \arg\min \Pi\big(\tilde{V}^{d+1}\big) \qquad (7.30)$$
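Assuming the accumulators of Equation 7.27 are available at the end of the period, the updates of Equations 7.28 and 7.29 reduce to a few vector operations, as sketched below with randomly generated placeholder values; the policy improvement of Equation 7.30 is the greedy minimization already shown in the sketch of the mapping $\Pi$.

```python
import numpy as np

K = 5
rng = np.random.default_rng(2)

# Hypothetical accumulators collected during the d-th regenerative period via Equation 7.27.
S_g = rng.uniform(1.0, 5.0, size=K)      # accumulated per-stage costs S_g^d(i)
S_V = rng.uniform(1.0, 5.0, size=K)      # accumulated next-state potentials S_V^d(i)
visits = rng.integers(1, 10, size=K)     # visit counts l^d(i), each >= 1 within a period

V_d = rng.uniform(0.0, 1.0, size=K)      # current potential estimate V~^d
step = 0.5                               # step size varsigma_d (illustrative value)
ref = K - 1                              # reference state eps_K (largest velocity error)

target = S_g / visits + S_V / visits     # sample estimate of r(eps_i) + E[V~^d | eps_i]
Y = (target - target[ref]) - V_d         # increment Y^d(eps_i) of Equation 7.29
V_next = V_d + step * Y                  # update of Equation 7.28
```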
Therefore, we can construct an online value iteration algorithm as follows:
Step 1 (initialization): Set $k = 0$ and start the system at an initial state $\varepsilon(0)$. Set $d = 0$ and initialize the potential $\tilde{V}^0$ and the policy $\pi_0 = \arg\min \Pi(\tilde{V}^0)$ in the 0th regenerative period.
Step 2 (online potential estimation): At the beginning of the $d$th regenerative period, set $S_g^d(i) = 0$, $S_V^d(i) = 0$, and $l^d(i) = 0$, $\forall i$. Run the system with policy $\pi_d$ until slot $n_{d+1} - 1$ and accumulate the information of the visited ε from slot to slot according to Equation 7.27. At slot $n_{d+1} - 1$, update the estimated potential $\tilde{V}^{d+1}$ for the $(d+1)$th regenerative period according to Equation 7.28.
Step 3 (online policy improvement): Update the policy $\pi_{d+1}$ for the $(d+1)$th regenerative period according to Equation 7.30.
Step 4 (termination): If $\|\tilde{V}^{d+1} - \tilde{V}^d\| < \sigma_v$, stop; otherwise, set $d := d + 1$ and go to step 2.
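Putting the four steps together, a toy end-to-end version of the online value iteration might look as follows; the reduced-state cost array, transition kernel, step-size schedule, and tolerance are all illustrative assumptions, with a simulated error-state process standing in for the real system.

```python
import numpy as np

K, A = 5, 3
rng = np.random.default_rng(3)

r = rng.uniform(0.0, 1.0, size=(K, A))           # toy reduced-state cost r(eps, a)
P = rng.dirichlet(np.ones(K), size=(K, A))       # toy kernel Pr[eps' | eps, a]

def run_period(policy, V, eps):
    """Simulate one regenerative period under `policy`, returning the accumulators
    of Equation 7.27 and the state at which the next period starts."""
    S_g, S_V, visits = np.zeros(K), np.zeros(K), np.zeros(K)
    while visits.min() < 1:
        a = policy[eps]
        nxt = rng.choice(K, p=P[eps, a])
        S_g[eps] += r[eps, a]
        S_V[eps] += V[nxt]
        visits[eps] += 1
        eps = nxt
    return S_g, S_V, visits, eps

V = np.zeros(K)                                   # Step 1: initialization
policy = (r + P @ V).argmin(axis=1)
eps, tol, ref = 0, 1e-3, K - 1

for d in range(1000):
    S_g, S_V, visits, eps = run_period(policy, V, eps)   # Step 2: online potential estimation
    target = S_g / visits + S_V / visits
    Y = (target - target[ref]) - V
    V_next = V + (1.0 / (d + 1)) * Y                     # decreasing step size varsigma_d
    policy = (r + P @ V_next).argmin(axis=1)             # Step 3: online policy improvement
    if np.linalg.norm(V_next - V) < tol:                 # Step 4: termination check
        V = V_next
        break
    V = V_next
```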
7.5.3 Optimal Guidance Trajectory Calculation
An optimal CoMP cluster selection and handoff decision policy, which minimizes the linear quadratic cost function, can be derived from the algorithm presented above. Recall that the linear quadratic cost function is defined in Equation 7.4 as the sum of the tracking error and the control magnitude. Due to the handoff latency in the system, a tracking error always exists while the train travels along the guidance trajectory. To further reduce the tracking error, the second part of our scheme recalculates the guidance trajectory. The newly calculated guidance trajectory should take full account of the tracking error caused by handoff latency. The calculation approach is presented in the following.
In traditional fixed-block, track circuit-based train control systems, optimal guidance trajectory calculation has been studied for many years. The earliest and most noticeable work is from the Scheduling and Control Group in South Australia [30]. They conducted theoretical research and practical projects on optimal train travel trajectories, considering energy saving and trip time. The results show that the optimal guidance trajectory can be divided into four phases: traction, speed holding, coasting, and braking, as illustrated in Figure 7.5. The shift and velocity of the four phases are $S_1, S_2, S_3, S_4$ and $v_1, v_2, v_3, v_4$, respectively. Based on that, many researchers have investigated the optimal travel trajectory with the aim of finding the switch points [30]. However, most existing works focus on traditional fixed-block, track circuit-based train control systems, where the train travel trajectory is not affected by the trains in front.
Figure 7.5 Optimal train guidance trajectory (velocity versus shift, with the traction, speed holding, coasting, and braking phases delimited by $S_1, S_2, S_3, S_4$ and $v_1, v_2, v_3$).
In this chapter, we propose an optimal guidance trajectory calculation for CBTC systems with CoMP. The scheme takes full account of the impact of the preceding train and is described as follows:
Step 1: Initialize the basic parameters: train mass $M$, train traction force $F_{\mathrm{trac}}$, train braking force $F_{\mathrm{brak}}$, trip time $T_{\mathrm{trip}}$, and trip distance $S_{\mathrm{trip}}$. Set the current train velocity to $v_1 = 0$, the current train position to $S_{\mathrm{cur}} = 0$, and the current traveled time to $T_{\mathrm{cur}} = 0$.
Step 2: Compute the remaining trip time, $T_{\mathrm{remaintrip}} = T_{\mathrm{trip}} - T_{\mathrm{cur}}$, and the remaining trip distance, $S_{\mathrm{remaintrip}} = S_{\mathrm{trip}} - S_{\mathrm{cur}}$. With $T_{\mathrm{remaintrip}}$, $v_1$, $S_{\mathrm{remaintrip}}$, and the basic parameters, the optimal guidance trajectory can be calculated as follows:
As shown in Figure 7.5, the optimal guidance trajectory can be divided into four phases: traction, velocity holding, coasting, and braking. Therefore, given the known traction force and braking force, the optimal guidance trajectory is determined once the holding velocity $v_2$ and the velocity $v_3$ that marks the end of the coasting phase are obtained. With this objective, the optimization problem can be formulated as
$$\begin{aligned}
\min_{v_2, v_3} \quad & f = \frac{1}{2} M \left( v_2^2 - v_1^2 \right) + F_{\mathrm{fric}} \cdot S_2 \\
\text{s.t.} \quad & v_1 \le v_2, \quad v_3 \le v_2 \\
& \frac{v_2^2/2 - v_1^2/2}{(F_{\mathrm{trac}} - F_{\mathrm{fric}})/M} + S_2 + \frac{v_2^2/2 - v_3^2/2}{F_{\mathrm{fric}}/M} + \frac{v_3^2/2}{F_{\mathrm{brak}}/M} = S_{\mathrm{remaintrip}} \\
& \frac{v_2 - v_1}{(F_{\mathrm{trac}} - F_{\mathrm{fric}})/M} + \frac{S_2}{v_2} + \frac{v_2 - v_3}{F_{\mathrm{fric}}/M} + \frac{v_3}{F_{\mathrm{brak}}/M} = T_{\mathrm{remaintrip}}
\end{aligned} \qquad (7.31)$$
In this formulation, $f$ is the energy consumed to accelerate the train from $v_1$ to $v_2$ and to keep it traveling at speed $v_2$ for a distance of $S_2$. The second constraint is the trip distance constraint: its four parts represent the distances traveled in the traction, velocity holding, coasting, and braking phases, respectively, and their sum must equal the remaining trip distance. The third constraint is the trip time constraint: its four parts represent the times spent in the traction, velocity holding, coasting, and braking phases, respectively, and their sum must equal the remaining trip time.
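Because only two decision variables ($v_2$ and $v_3$) remain once $S_2$ is eliminated through the distance constraint, one simple way to solve Equation 7.31 numerically is a grid search: for each candidate pair, recover $S_2$ from the distance constraint, discard pairs that violate the time constraint or give a negative $S_2$, and keep the pair with the lowest energy $f$. The sketch below follows this idea; all numerical parameter values and the 1 s time tolerance are illustrative assumptions, not values from the chapter.

```python
import numpy as np

# Illustrative parameters (not taken from the chapter's simulation setup).
M = 50000.0                                          # train mass [kg]
F_trac, F_brak, F_fric = 40000.0, 45000.0, 3000.0    # traction, braking, resistance forces [N]
v1 = 5.0                                             # current velocity [m/s]
S_rem, T_rem = 4000.0, 220.0                         # remaining distance [m] and time [s]

a_trac = (F_trac - F_fric) / M     # net acceleration while applying traction
a_coast = F_fric / M               # deceleration while coasting
a_brak = F_brak / M                # deceleration while braking

best = None
for v2 in np.arange(v1 + 0.1, 40.0, 0.1):            # candidate holding velocities
    for v3 in np.arange(0.1, v2, 0.1):               # candidate end-of-coasting velocities
        d_trac = (v2**2 - v1**2) / (2 * a_trac)
        d_coast = (v2**2 - v3**2) / (2 * a_coast)
        d_brak = v3**2 / (2 * a_brak)
        S2 = S_rem - d_trac - d_coast - d_brak       # holding distance from the distance constraint
        if S2 < 0:
            continue
        t_total = ((v2 - v1) / a_trac + S2 / v2
                   + (v2 - v3) / a_coast + v3 / a_brak)
        if abs(t_total - T_rem) > 1.0:               # enforce the trip-time constraint (1 s tolerance)
            continue
        f = 0.5 * M * (v2**2 - v1**2) + F_fric * S2  # energy objective of Equation 7.31
        if best is None or f < best[0]:
            best = (f, v2, v3, S2)

# `best` now holds the (approximately) optimal holding velocity v2 and coasting end velocity v3.
```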
Step 3: If $S_{\mathrm{remaintrip}} = 0$, go to step 4. Otherwise, compare the current train velocity $v_1$ with the reference train velocity $v_{\mathrm{ref}}$ on the calculated guidance trajectory. If $|v_1 - v_{\mathrm{ref}}| \ge \Delta v$, where $\Delta v$ is the preset velocity error threshold, go to step 2.
Otherwise,update the train velocity
v
1
and train position
S
cu
r
according to
the train control model described in Section 7.3.1.
Step 4: e train stops at the destination. Wait until the train starts, and then
go to Step 1.
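A structural sketch of this recalculation loop is shown below; `solve_guidance` and `advance_train` are hypothetical placeholders for the optimization of Equation 7.31 and for the train control model of Section 7.3.1, and the numerical values are illustrative only.

```python
# Structural sketch of the online guidance-trajectory recalculation loop (steps 1 to 3).

def solve_guidance(v1, s_remaining, t_remaining):
    # Placeholder: would return the holding/coasting velocities and a reference profile
    # by solving Equation 7.31 for the remaining distance and time.
    return {"v_ref": max(v1, 10.0)}

def advance_train(v1, s_cur, t_cur):
    # Placeholder: one control period of the train dynamics (Section 7.3.1).
    return v1 + 0.2, s_cur + v1 * 0.1 + 0.01, t_cur + 0.1

S_trip, T_trip = 4000.0, 220.0      # illustrative trip distance [m] and trip time [s]
dv_threshold = 1.0                  # preset velocity error threshold

v1, s_cur, t_cur = 0.0, 0.0, 0.0                                # step 1: initialization
guidance = solve_guidance(v1, S_trip - s_cur, T_trip - t_cur)   # step 2: initial trajectory

while S_trip - s_cur > 0.0:                          # step 3: run until the destination
    if abs(v1 - guidance["v_ref"]) >= dv_threshold:  # tracking error too large:
        # back to step 2: recalculate the guidance trajectory
        guidance = solve_guidance(v1, S_trip - s_cur, T_trip - t_cur)
    v1, s_cur, t_cur = advance_train(v1, s_cur, t_cur)
# step 4: the train has reached the destination; wait for the next departure.
```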
7.6 Simulation Results and Discussions
In this section, simulation results are presented to illustrate the performance of our proposed system (Table 7.2). The system optimization process can be divided into an off-line part and an online part. For the off-line part, the mathematical models described earlier are used to derive the optimal policy for CoMP cluster selection and handoff decisions. The calculation of the optimal policy using the value iteration algorithm is performed off-line in the C language. Once the optimal policy is obtained, it is stored in a table format. Each entry of the table specifies the optimal action (CoMP cluster selection and handoff) for a given current state (i.e., channel state and currently used cluster).
For the online part, computer simulation based on NS2.29 is used. Specifically, at each decision epoch, each MT on the train looks up the policy table to find the optimal action corresponding to its current state and then executes that action. Such online table lookups can be performed with little computational complexity in practice.
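As a small illustration of this online part (with a purely hypothetical table and state encoding), the per-epoch work reduces to a constant-time dictionary lookup:

```python
# Hypothetical policy table produced off-line: maps a quantized state
# (channel state index, currently used CoMP cluster) to the optimal action
# (CoMP cluster to use, handoff decision).
policy_table = {
    (0, "cluster_A"): ("cluster_A", "stay"),
    (1, "cluster_A"): ("cluster_B", "handoff"),
    (2, "cluster_B"): ("cluster_B", "stay"),
}

def decide(channel_state, current_cluster):
    """One decision epoch: constant-time lookup of the pre-computed optimal action."""
    return policy_table[(channel_state, current_cluster)]

cluster, handoff = decide(1, "cluster_A")
```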
In the simulations using NS2, we consider a scenario with three moving nodes (MTs on the trains) traveling between two stations, A and B. The wayside nodes (base stations) are deployed along the railway line with an average spacing of 600 m between two successive base stations. These three moving nodes depart from station A successively with a constant interval and stop at station B. In the moving
Table 7.2 Simulation Parameters

Notation    Definition                         Value
M           Train mass                         50,000 kg
T_ex        Extra delay caused by handoff      9 μs
P_o         Transmission power                 100 mW
L_ack       ACK packet size                    20 bytes
L_packet    Data packet size                   200 bytes
L           Number of channel states           4
K           Number of tracking error states    5
T           Communication period               100 ms