Chapter 10

Robot Trajectory Generation in Joint Space

Abstract

This chapter proposes a method for generating a desired robot trajectory that differs from previous approaches based on inverse kinematics and dynamic time warping. The classic hidden Markov model (HMM) is modified so that it is better suited to generating trajectories in joint space. We introduce a new auxiliary output in the HMM to help the training process. Lloyd's algorithm is modified for the HMM so that it can solve the problems of joint space learning. The proposed method is validated with a two-link planar robot and a four degree-of-freedom (4-DoF) exoskeleton robot.

Keywords

Joint space; Trajectory generation; Hidden Markov model

10.1 Codebook and key-points generation

The dynamics of an n-link robot in joint space can be written as

$M (q) \overset{\cdot \cdot}{q} + c (q, \overset{\cdot}{q}) \overset{\cdot}{q} + g (q) + F (\overset{\cdot}{q}) = τ$

(10.1)

where $q \in ℜ^{n}$ denotes the link angles, $\overset{\cdot}{q} \in ℜ^{n}$ denotes the link velocities, $M (q) \in ℜ^{n \times n}$ is the inertia matrix, $c (q, \overset{\cdot}{q}) \in ℜ^{n \times n}$ is the centripetal and Coriolis matrix, $g (q) \in ℜ^{n}$ is the gravity vector, $F \in ℜ^{n}$ is the vector of friction terms, and $τ \in ℜ^{n}$ is the input control vector.

The objective is to generate the desired trajectory $q^{⁎}$ from the demonstrations in the joint space $Q = {[Q_{1} \dots Q_{n}]}^{T}$ , $Q_{i} = {[X_{i}^{1} \dots X_{i}^{m}]}^{T}$ . Here, $X_{i}^{r}$ is the rth demonstrated trajectory of the ith joint; it is defined as $X_{i}^{r} = \{q_{i} (l)\}$ , $l = 1 \dots T_{i}^{r}$ , where $T_{i}^{r}$ is the total number of joint-angle samples. There are m trajectories for each joint. The trajectory number m is not necessarily large, because a human can repeat the same motion only so many times, and a large m may cause repeated calculation in the HMM. The data length $T_{i}^{r}$ differs from one demonstration to another because of different speeds and different starting and ending times.

Each joint for each demonstration has its own codebook, defined as $C_{i}$ , $i = 1 \dots n$ . A codebook can be regarded as an $N_{i}$ -dimensional vector, i.e., $C_{i} = {[c_{i 1} \dots c_{i N_{i}}]}^{T}$ . The codebook dimension $N_{i}$ is selected using prior knowledge of the trajectory's geometry.

We use Lloyd's algorithm to train the codebook $C_{i} (t)$ . Here, t denotes the training iteration. The initial codebook is defined as $C_{i} (1)$ . A poor initial codebook may lead to a local minimum. A heuristic method [160] is used to find a better $C_{i} (1)$ . Since the heuristic method needs a lot of time, $C_{i} (1)$ can also be selected randomly from $X_{i}^{r}$ .

The objective of using Lloyd's algorithm to create the codebook is to minimize a quantization error with certain data distribution. We need nearest-neighbor and centroid conditions, which are commonly used in Lloyd's algorithm. For one point $q_{i} (l_{1})$ in the trajectory, we want to find the nearest codebook element by calculating

$\min_{1 ⩽ j ⩽ N} | q_{i} (l_{1}) - c_{i j} |, l_{1} = 1 \dots T, i = 1 \dots n$

(10.2)

If the nearest codebook element is $c_{i k}$ , the point $q_{i} (l_{1})$ belongs to the region $R_{i k}$ . The regions $R_{i j}$ ( $i = 1 \dots n$ , $j = 1 \dots N_{i}$ ) are defined as

$R_{i k} = \{q_{i} (l_{1})\} = [r_{i k 1} \dots r_{i k p_{i k}}], k = 1 \dots s_{i}^{r}$

(10.3)

where $p_{i k}$ is the length of the region $R_{i k}$ , and $s_{i}^{r}$ is the number of regions for joint i and demonstration r.

Obviously, the center of the region $R_{i j}$ should be $c_{i j}$ . The new center of $R_{i j}$ can be calculated as

$c_{i j} = \frac{1}{p_{i j}} \sum_{l = 1}^{p_{i j}} r_{i j l}$

(10.4)

This is the centroid condition. The standard Lloyd's algorithm uses selected points to construct the centers [42]. We instead calculate the mean value of all points in the region $R_{i j}$ . The advantage is that the mean can be updated online.

With (10.3) and (10.4), we can calculate $c_{i j} (t)$ recursively. We use the following quantization error to design a stop criterion. The average quantization error for $X_{i j} = \{q_{i} (l)\}$ , $l = 1 \dots T$ , is defined as

$ε_{i j} (t) = \frac{1}{T} \sum_{l = 1}^{T} | q_{i} (l) - c_{i j} (t) |$

(10.5)

where t is the iteration number. We define a relative quantization error as

$Δ ε_{i j} (t) = | \frac{ε_{i j} (t + 1) - ε_{i j} (t)}{ε_{i j} (t + 1)} |$

(10.6)

The codebook calculation is stopped when the relative error (10.6) is small enough as

$Δ ε_{i j} (t) ⩽ \bar{ε}$

(10.7)

where $\bar{ε}$ is a predefined upper bound.
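The recursion described above can be sketched for a single joint as follows. This is a minimal illustration, not the authors' implementation: the function name `lloyd_codebook`, the `tol` and `max_iter` parameters, and the use of the distance to the nearest codeword as the quantization error are assumptions made for the example.

```python
def lloyd_codebook(samples, codebook, tol=1e-3, max_iter=100):
    """Refine a scalar codebook with Lloyd's algorithm (illustrative sketch).

    samples  : joint angles q_i(l) of one demonstration
    codebook : initial codebook C_i(1), a list of N_i values
    tol      : upper bound on the relative quantization error, as in (10.7)
    """
    codebook = list(codebook)
    prev_error = None
    for _ in range(max_iter):
        # Nearest-neighbor condition (10.2): group samples by closest codeword.
        regions = [[] for _ in codebook]
        for q in samples:
            k = min(range(len(codebook)), key=lambda j: abs(q - codebook[j]))
            regions[k].append(q)
        # Centroid condition (10.4): each codeword becomes the mean of its region.
        for j, region in enumerate(regions):
            if region:  # an empty region keeps its previous codeword
                codebook[j] = sum(region) / len(region)
        # Assumed error measure: mean distance of each sample to its nearest codeword.
        error = sum(min(abs(q - c) for c in codebook) for q in samples) / len(samples)
        if error == 0:
            break
        # Stop when the relative change of the error is small, compare (10.6)-(10.7).
        if prev_error is not None and abs(error - prev_error) / error <= tol:
            break
        prev_error = error
    return codebook


samples = [0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 2.0, 2.1, 1.9]
trained = lloyd_codebook(samples, [0.2, 0.8, 1.7])
```

Because only the angle values are clustered, the sample order and the demonstration speed do not affect the resulting codebook, which is the property Remark 10.1 relies on.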

Remark 10.1

We use Lloyd's algorithm to quantize the robot joint trajectories and obtain the codebook. Since we use the minimum distance as in (10.2), the codebook obtained from the above recursive method is optimal; see [42]. This means each joint angle $q_{i} (l)$ has a minimal distance to its codebook element. The quantization is along the Y-axis (joint angle axis); the time-axis is not considered. Therefore, the similarity of trajectories with different durations or speeds can be measured directly. The advantage of this method is that we do not need dynamic time warping (DTW) as in [140].

In order to train the hidden Markov model, we need key-points and observation symbols. These will be generated from the above codebook. The key-points are the start point, the end point, and the center points in the time-axis of the codebook. If the time index of $c_{i j}$ in (10.4) is defined as $t_{i j}$ ( $i = 1 \dots n$ , $j = 1 \dots N_{i}$ ), then the key-points are calculated as

$k_{i j} = t_{i j} + \frac{t_{i (j + 1)} - t_{i j}}{2}$

(10.8)

For each demonstration, the key-point set is $K_{i}^{r} = {[k_{i 1} \dots k_{i, v_{i}^{r}}]}^{T}$ ( $i = 1 \dots n$ , $r = 1 \dots m$ ), where $v_{i}^{r}$ is the number of key-points in $K_{i}^{r}$ . Since there is one key-point in the center of each region $R_{i j}$ and the starting and ending points of the trajectory are also key-points, $v_{i}^{r} = s_{i}^{r} + 2$ .
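The key-point construction can be illustrated with a short sketch. It assumes that a region is a maximal run of consecutive samples sharing the same nearest codeword and that the key-point of a region is the midpoint of its time span; the function name `key_points` and the 0-based sample indices are choices made for this example only.

```python
def key_points(samples, codebook):
    """Return (key_point_indices, region_count) for one demonstration."""
    # Label every sample with the index of its nearest codeword, as in (10.2).
    labels = [min(range(len(codebook)), key=lambda j: abs(q - codebook[j]))
              for q in samples]
    # A region is a maximal run of consecutive samples sharing one label.
    runs = []  # (first_index, last_index) of each run
    start = 0
    for l in range(1, len(labels) + 1):
        if l == len(labels) or labels[l] != labels[start]:
            runs.append((start, l - 1))
            start = l
    # Key-point of a region: the midpoint of its time span, compare (10.8).
    centers = [first + (last - first) // 2 for first, last in runs]
    # The start and end of the trajectory are key-points as well.
    points = [0] + centers + [len(samples) - 1]
    return points, len(runs)


# A trajectory that rises through three quantization levels and returns.
demo = [0, 0, 0, 1, 1, 1, 2, 2, 2, 1, 1, 1, 0, 0, 0]
points, region_count = key_points(demo, [0, 1, 2])
```

With three codewords this toy trajectory produces five regions and seven key-points, which agrees with the relation $v_{i}^{r} = s_{i}^{r} + 2$ .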

The observation symbols are the joint angles at the times of the key-points. When there are fewer of them than codebook elements, we use the codebook values. The observation symbols are $O_{i}^{r} = {[o_{i 1} \dots o_{i N_{i}}]}^{T}$ ( $i = 1 \dots n$ , $r = 1 \dots m$ ):

$\begin{cases} o_{i j} (\cdot, q) = c_{i, j + 1} & \text{if } c_{i j} ⩽ q_{i}^{r} (k) ⩽ c_{i, j + 1} \\ o_{i j} (s, \cdot) = j + 1 & \text{otherwise} \end{cases}, j = 1 \dots N_{i}, k = 1 \dots v_{i}^{r}$

(10.9)

where s is the symbol of the observation, and q is the value of the codebook.

Lloyd's algorithm allows us to set the dimensions of the key-points and of the observation symbols to be the same as that of the codebook. We thus avoid the manual tuning process used in [42,140,152].

We use the following example to explain how to use Lloyd's algorithm to generate the codebook, key-points, and observation symbols.

Example 10.1

A robot draws a circle. The angle of Joint-1 is shown in Fig. 10.1. We first show how the codebook number N affects the results. The two square waves correspond to $N = 3$ and $N = 4$ . When $N = 3$ , there are 3 quantization levels in the Y-axis. They generate 5 regions in the trajectory, $s_{1}^{1} = 5$ , $R_{11} \dots R_{15}$ . The key-points are in the center of each segment of the square waves. With the starting point and ending point, the key-point number of $K_{1}^{1}$ is 7, $v_{1}^{1} = s_{1}^{1} + 2 = 7$ . When $N = 4$ , the region number in (10.3) is $s_{1}^{1} = 7$ , and the key-point number is $v_{1}^{1} = 9$ . If there are 3 demonstrations, then $r = 1, 2, 3$ ; see Fig. 10.2. The velocities are different; usually dynamic time warping is needed to align them in the time-axis. Here, we use Lloyd's algorithm; the key-point number of these three demonstrations is the same, $N = 3$ , $s_{1}^{1} = s_{1}^{2} = s_{1}^{3} = 5$ . Although the key-points in the time-axis are different, the observation symbols in the Y-axis are similar.

Figure 10.1 The codebook number N = 3 (broken line) and N = 4 (continuous line).

Figure 10.2 Different demonstrations with similar observation symbols.

The following algorithm explains the calculation process of the codebook $C_{i}$ , key-points $K_{i}^{r}$ , and the observation symbols $O_{i}^{r}$ .

Algorithm 10.1

Calculation of codebook, key-points, and observation symbols.

Obtain demonstrations $Q = {[Q_{1} \dots Q_{n}]}^{T}$ , $Q_{i} = {[X_{i}^{1} \dots X_{i}^{m}]}^{T}$

Set the codebook size N

For $i = 1$ to n

For $r = 1$ to m

$t = 1$

While $Δ ε_{i j} (t) > \bar{ε}$

For $j = 1$ to $N_{i}$

Calculate the region $R_{i j} (t)$ with (10.3)

Calculate center $c_{i j} (t)$ with (10.4)

Calculate the relative quantization error with (10.6)

End for j

$t = t + 1$

End while

Calculate the key-points $K_{i}^{r}$ with (10.8)

Calculate the observation symbols $O_{i}^{r}$ with (10.9)

End for r

End for i

10.2 Joint space trajectory generation with a modified hidden Markov model

After the codebook, key-points, and observation symbols are generated by Lloyd's algorithm, they are used to train a discrete hidden Markov model (HMM). The trained model can then generate the desired trajectory in joint space; see Fig. 10.3.

Figure 10.3 Robot trajectory generation using HMM: (A) joint space, (B) task space.

An HMM can be understood as a finite state machine that, at any instant, is in one of N different states $S = \{S_{1}, S_{2}, S_{3}, \dots, S_{N}\}$ . At regular intervals, the HMM may change its state. The times associated with the changes are denoted by $t = 1, 2, \dots$ , and the state at time t is denoted by $h_{t}$ . In the case of process modeling, the change (transition) probabilities depend only on the current state $t_{k}$ and its previous states $t_{k - 1}$ , $t_{k - 2}, \dots$ :

$P (h_{t} = S_{j} | h_{t - 1} = S_{k}, h_{t - 2} = S_{k}, \dots) = P (h_{t} = S_{j} | h_{t - k} = S_{k})$

(10.10)

where $t = 1, 2, \dots$ , $k = t - 1, t - 2, \dots$ .

Since the right-hand side of the above equality is independent of t, each transition probability $a_{i j}$ is defined as

$a_{i j} = P (h_{t} = S_{j} | h_{t - 1} = S_{i}), 1 ⩽ i, j ⩽ N$

(10.11)

Now we define the transition probability distribution matrix as $A = \{a_{i j}\}$ . If each state can lead to any other state in a single transition, then $a_{i j} > 0$ for all i and j.

The number of distinct observation symbols is defined as M. The set of symbols corresponding to the observable output of the system is defined as

$V = \{v_{1}, v_{2}, \dots, v_{M}\}$

(10.12)

The observation symbol distribution is defined as $B = \{b_{j} (k)\}$ . The probability of observing the symbol $v_{k}$ in state j is

$b_{j} (k) = P [O_{t} = v_{k} | h_{t} = S_{j}], 1 ⩽ j ⩽ N, 1 ⩽ k ⩽ M$

(10.13)

Fig. 10.4 gives a graphical representation of the HMM for the trajectory in Fig. 10.2. Here, the codebook number is $N_{1} = 4$ , and the hidden state number is $v_{1} = 7$ . The configuration of this HMM is the well-known left–right topology.

Figure 10.4 A hidden Markov model of the trajectory in Fig. 10.2.

The probability distribution of the initial state is defined as $π = \{π_{i}\}$ :

$π_{i} = P [h_{1} = S_{i}], 1 ⩽ i ⩽ N$

(10.14)

Given suitable values of N, M, A, B, and π, an HMM generates a sequence of observations $O = [O_{1}, O_{2}, \dots, O_{T}]$ according to the following algorithm. Each observation $O_{t}$ is a symbol of the set V in (10.12):

1. Select an initial state $h_{1} = S_{i}$ according to the initial distribution $π_{i}$ . Set time $t = 1$ .
2. Select $O_{t} = v_{k}$ according to the observation probability distribution of the current state $S_{i}$ , i.e., $b_{i} (k)$ .
3. Make a transition to the new state $h_{t + 1} = S_{j}$ according to the transition probability distribution of $S_{i}$ , i.e., $a_{i j}$ .
4. Set $t = t + 1$ and go to Step 2, until $t = T$ .

With the above procedure, an HMM can be represented as
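The four generation steps can be sketched in a few lines. The function name `sample_hmm` and the representation of A, B, and π as nested Python lists are assumptions for illustration; states and symbols are 0-based indices.

```python
import random


def sample_hmm(A, B, pi, T, rng=random):
    """Draw T observation symbols (0-based indices into V) from (A, B, pi)."""
    def draw(probabilities):
        # Sample an index according to a discrete distribution.
        return rng.choices(range(len(probabilities)), weights=probabilities)[0]

    state = draw(pi)                         # Step 1: initial state from pi
    observations, states = [], []
    for _ in range(T):                       # Step 4: repeat until T symbols exist
        states.append(state)
        observations.append(draw(B[state]))  # Step 2: emit a symbol from b_i(k)
        state = draw(A[state])               # Step 3: move to the next state
    return observations, states


# A deterministic left-right model: state 0 always moves to state 1.
A = [[0.0, 1.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
pi = [1.0, 0.0]
observations, states = sample_hmm(A, B, pi, T=4)
```

With these degenerate probabilities the output is fully determined, which makes the left-right behavior easy to check by hand.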

$λ = (A, B, π)$

(10.15)

For the pth joint, the HMM $λ_{p}$ can be considered a generalization of the following mixture model [116]:

$λ_{p} = (A_{p}, B_{p}, π_{p})$

(10.16)

Since the differences caused by motion speed and starting point are much bigger in joint space than in task space, and the demonstrations cannot be many, the HMM (10.16) cannot generate a desired trajectory with good accuracy in the joint space. We modify the HMM (10.16) into the following form:

$λ_{p} = (A_{p}, B_{p}, π_{p}, {\hat{q}}_{p})$

(10.17)

where ${\hat{q}}_{p} = {[{\hat{q}}_{p 1} \dots {\hat{q}}_{p N_{p}}]}^{T}$ is the output of the HMM (scaled joint angle), which is defined in (10.1).

The output of the HMM ${\hat{q}}_{p}$ is also regarded as one demonstration, and is sent to the HMM again for its training.

The dimension of $A_{p}$ (the number of states) relates to the number of key-points. Although the states in the HMM are hidden, the algorithm for codebook and key-point generation gives them a physical meaning in terms of the trajectory. The state of the pth joint is defined as $S_{p} = {[s_{p 1}, \dots, s_{p v_{p}}]}^{T}$ . The hidden state number $v_{p}$ is the maximum of the key-point numbers:

$v_{p} = \max_{1 ⩽ r ⩽ m} \{v_{p}^{r}\}$

$v_{p} = \max_{1 ⩽ r ⩽ m} {v_{p}^{r}}$

(10.18)

where m is the demonstration number, $v_{p}$ is the hidden state number, and $N_{p}$ is the codebook number.

We use the key-point number $v_{p}^{r}$ as the number of states of the HMM. Since $v_{p}^{r}$ depends on the codebook $C_{p}$ , the m trajectories are given the same key-point number, which is defined as $v_{p}$ . The missing points are assigned the corresponding values of the other demonstrations. The observation symbols are $O_{p} = {[o_{p 1}, \dots, o_{p N_{p}}]}^{T}$ ( $p = 1 \dots n$ ).

In order to train the model (10.17) with the data generated by Lloyd's algorithm in the last section, we use the following four steps:

1) Given a sequence of observations $O = O_{1} O_{2} \dots O_{T}$ and the model $λ_{p} = (A_{p}, B_{p}, π_{p}, {\hat{q}}_{p})$ , calculate the probability of the sequence of observations $P (O | λ_{p})$ .

The direct way to calculate the probability of $O_{1} O_{2} \dots O_{T}$ with respect to $λ_{p}$ is to enumerate every possible state sequence of length T (the number of observations). Considering a fixed state sequence $H = h_{1} h_{2} \dots h_{T}$ , the probability of the observation sequence $O = O_{1} O_{2} \dots O_{T}$ for this sequence is

$P (O | H, λ_{p}) = \prod_{t = 1}^{T} P (O_{t} | h_{t}, λ_{p})$

(10.19)

Under the independence assumption for the observations, this probability becomes

$P (O | H, λ_{p}) = b_{h_{1}} (O_{1}) b_{h_{2}} (O_{2}) \cdots b_{h_{T}} (O_{T})$

(10.20)

The probability of the state sequence H is $P (H | λ_{p}) = π_{h_{1}} a_{h_{1} h_{2}} a_{h_{2} h_{3}} \cdots a_{h_{T - 1} h_{T}}$ . The probability of O is calculated by summing the joint probability $P (O | H, λ_{p}) P (H | λ_{p})$ over all possible state sequences H:

$\begin{matrix} P (O | λ_{p}) = \sum_{H} P (O | H, λ_{p}) P (H | λ_{p}) \\ = \sum_{h_{1} h_{2} \dots h_{T}} π_{h_{1}} b_{h_{1}} (O_{1}) a_{h_{1} h_{2}} b_{h_{2}} (O_{2}) \cdots a_{h_{T - 1} h_{T}} b_{h_{T}} (O_{T}) \end{matrix}$

(10.21)

We use the following feedforward–feedback process to calculate (10.21).

We define $α_{t} (i)$ and $β_{t} (i)$ as

$\begin{matrix} α_{t} (i) = P (O_{1}, O_{2}, \dots, O_{t}, h_{t} = S_{i} | λ_{p}) \\ β_{t} (i) = P (O_{t + 1}, O_{t + 2}, \dots, O_{T} | h_{t} = S_{i}, λ_{p}) \end{matrix}$

(10.22)

where $α_{t} (i)$ is the joint probability of the partial observation sequence up to time $t ⩽ T$ and the state $S_{i}$ at time t, and $β_{t} (i)$ is the probability of the partial observation sequence from $t + 1$ to T, given the state $S_{i}$ at time t.

The feedforward process is as follows: a) start from $α_{1} (i) = π_{i} b_{i} (O_{1})$ ; then

$α_{t + 1} (j) = [\sum_{i = 1}^{N} α_{t} (i) a_{i j}] b_{j} (O_{t + 1}), t = 1 \dots T - 1$

finally, $P (O | λ_{p}) = \sum_{i = 1}^{N} α_{T} (i)$ .

The feedback process is as follows: b) start from $β_{T} (i) = 1$ ; then

$β_{t} (i) = \sum_{j = 1}^{N} a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j), t = T - 1 \dots 1$

finally, $P (O | λ_{p}) = \sum_{i = 1}^{N} π_{i} b_{i} (O_{1}) β_{1} (i)$ .
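As a check on the two recursions, the following sketch computes $P (O | λ)$ three ways: by the standard forward pass, by the standard backward pass, and by direct enumeration of all state sequences as in (10.21). The function names and the small two-state model are illustrative assumptions, and indices are 0-based.

```python
from itertools import product


def forward(A, B, pi, obs):
    """alpha[t][i] = P(O_1..O_t, h_t = S_i | lambda), with 0-based t."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    return alpha


def backward(A, B, obs):
    """beta[t][i] = P(O_{t+1}..O_T | h_t = S_i, lambda), with 0-based t."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return beta


def brute_force(A, B, pi, obs):
    """Direct enumeration over all state sequences (exponential in T)."""
    N, total = len(pi), 0.0
    for H in product(range(N), repeat=len(obs)):
        p = pi[H[0]] * B[H[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[H[t - 1]][H[t]] * B[H[t]][obs[t]]
        total += p
    return total


A = [[0.7, 0.3], [0.0, 1.0]]   # left-right transitions
B = [[0.9, 0.1], [0.2, 0.8]]   # emission probabilities
pi = [1.0, 0.0]
obs = [0, 0, 1]
alpha = forward(A, B, pi, obs)
beta = backward(A, B, obs)
p_forward = sum(alpha[-1])
p_backward = sum(pi[i] * B[i][obs[0]] * beta[0][i] for i in range(len(pi)))
p_direct = brute_force(A, B, pi, obs)
```

All three values agree up to rounding, while the forward and backward passes need on the order of $N^{2} T$ operations instead of enumerating $N^{T}$ sequences.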

2) Find the state sequence $S_{i} = {[s_{i 1}, \dots, s_{i v_{i}}]}^{T}$ that best explains the observation sequence $O = O_{1}, O_{2}, \dots, O_{T}$ , i.e., the sequence that is optimal in the sense of

$δ_{T} (i) = \max_{h_{1}, h_{2}, \dots, h_{T - 1}} P (h_{1} h_{2} \dots h_{T} = i, O_{1} O_{2} \dots O_{T} | λ_{p})$

(10.23)

We need the highest probability along a single path. Considering the first t observations:

$δ_{t + 1} (j) = [\max_{1 ⩽ i ⩽ N} δ_{t} (i) a_{i j}] b_{j} (O_{t + 1})$

(10.24)

In order to maximize the equation at each $t, j$ , we use the following recursive procedure: a) start from $δ_{1} (i) = π_{i} b_{i} (O_{1})$ ; b) then $δ_{t} (j) = \max_{1 ⩽ i ⩽ N} [δ_{t - 1} (i) a_{i j}] b_{j} (O_{t})$ ; c) finally,

$S_{T}^{⁎} = \underset{1 ⩽ i ⩽ N}{\arg \max} [δ_{T} (i)]$
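The recursive maximization is the standard Viterbi procedure. A minimal sketch is given below; the function name `viterbi`, the back-pointer bookkeeping, and the two-state example model are assumptions added for illustration, with 0-based state and symbol indices.

```python
def viterbi(A, B, pi, obs):
    """Return the most probable state path and its joint probability."""
    N = len(pi)
    # a) initialization: delta_1(i) = pi_i * b_i(O_1)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]
    backpointers = []
    # b) recursion: delta_t(j) = max_i [delta_{t-1}(i) * a_ij] * b_j(O_t)
    for t in range(1, len(obs)):
        step, new_delta = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][obs[t]])
        backpointers.append(step)
        delta = new_delta
    # c) termination: pick the best final state, then trace the path backwards.
    last = max(range(N), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


A = [[0.7, 0.3], [0.0, 1.0]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [1.0, 0.0]
best_path, best_probability = viterbi(A, B, pi, [0, 0, 1])
```

For this model the best path stays in the first state for two steps and then moves to the second, with probability 0.9 × 0.7 × 0.9 × 0.3 × 0.8 = 0.13608.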

This finds the most probable state for each element of the codebook, i.e., we want to find

$\max_{1 ⩽ l ⩽ v_{i}} \{b_{l j}^{i}\}, j = 1 \dots N_{i}$

The index of the maximum $b_{l j}^{i}$ is $s c_{j}$ . Then we obtain the symbol set $\{s c_{1} \dots s c_{N_{i}}\}$ .

3) Use $P (O | λ_{p})$ in Step 1 and the optimal sequence in Step 2 to train the HMM. This means training $λ_{p}$ with $S_{p}$ to maximize $P [O_{p} | λ_{p}]$ .

The value of the state at the instant t is defined as $r (t)$ . The state transition matrix is $A_{p} = \{a_{l j}^{p}\}$ :

$a_{l j}^{p} = P [r (t + 1) = s_{p j} | r (t) = s_{p l}], 1 ⩽ l, j ⩽ v_{p}$

(10.25)

(10.25) represents the probability of being in the state $s_{p j}$ at time $t + 1$ given the state $s_{p l}$ at time t. The initial elements $a_{l j}^{i}$ of the state transition matrix $A_{i} (1)$ are selected from a uniform distribution subject to

$\begin{cases} \sum_{j} a_{l j}^{i} = 1 & l < j \\ a_{l j}^{i} = 0 & \text{otherwise} \end{cases}, i = 1 \dots n$

The observation probability matrix $B_{p} = \{b_{l j}^{p}\}$ is

$b_{l j}^{p} = P [o_{p l} at t | r (t) = s_{p j}], 1 ⩽ l ⩽ N_{p}, 1 ⩽ j ⩽ v_{p}$

(10.26)

The initial condition on the transition matrix means that the states cannot go back, and that states are skipped when key-points are missing from the observed sequences. The initial condition for $b_{l j}^{i}$ is

$\begin{cases} b_{l j}^{i} = 0 & o_{i l} \text{ is the symbol of the demonstration } l \\ b_{l j}^{i} = P [o_{i l} \text{ at } t | r (1) = s_{i j}] = 1 & \text{otherwise} \end{cases}$

The initial state distribution is $π_{p} (1) = [π_{p 1} \dots π_{p N_{p}}] = [1, 0 \dots 0]$ ,

$π_{p j} (1) = P [r (1) = s_{p j}], 1 ⩽ j ⩽ N_{p}$

(10.27)

We define

$\begin{matrix} ξ_{t} (i, j) = \frac{α_{t} (i) a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j)}{P (O | λ)} = \frac{α_{t} (i) a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j)}{\sum_{i = 1}^{N} \sum_{j = 1}^{N} α_{t} (i) a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j)} \\ γ_{t} (i) = \sum_{j = 1}^{N} ξ_{t} (i, j) \end{matrix}$

Here, $ξ_{t} (i, j)$ is the probability of being in state $S_{i}$ at time t and in state $S_{j}$ at time $t + 1$ , given the sequence of observations O and the model, and $γ_{t} (i)$ is the probability of being in state $S_{i}$ at time t given the sequence of observations O.

By the Baum–Welch algorithm [9], the transition probabilities $P [r (t + 1) = s_{i j} | r (t) = s_{i l}]$ and the emission distribution $P [o_{i l} \text{ at } t | r (t) = s_{i j}]$ are re-estimated as

$\begin{matrix} a_{i j} = \sum_{t = 1}^{T - 1} ξ_{t} (i, j) / \sum_{t = 1}^{T - 1} γ_{t} (i) \\ b_{j} (k) = \sum_{t : O_{t} = v_{k}} γ_{t} (j) / \sum_{t = 1}^{T - 1} γ_{t} (j) \end{matrix}$
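One re-estimation step for a single observation sequence can be sketched as follows. The function name `baum_welch_step` and the list-based model are assumptions for illustration. The emission update sums over $t = 1 \dots T - 1$ as written in the text, whereas many references extend that sum to T; the sketch does not reproduce the toolbox used in the experiments.

```python
def baum_welch_step(A, B, pi, obs):
    """One re-estimation of A and B from a single observation sequence."""
    N, M, T = len(A), len(B[0]), len(obs)
    # Forward and backward variables (0-based time index).
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    likelihood = sum(alpha[-1])
    # xi[t][i][j] and gamma[t][i] for the first T-1 time steps.
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    gamma = [[sum(xi[t][i]) for i in range(N)] for t in range(T - 1)]
    new_A = [row[:] for row in A]
    new_B = [row[:] for row in B]
    for i in range(N):
        visits = sum(gamma[t][i] for t in range(T - 1))
        if visits == 0:
            continue  # state never visited: keep its previous rows
        new_A[i] = [sum(xi[t][i][j] for t in range(T - 1)) / visits for j in range(N)]
        new_B[i] = [sum(gamma[t][i] for t in range(T - 1) if obs[t] == k) / visits
                    for k in range(M)]
    return new_A, new_B, likelihood


A = [[0.7, 0.3], [0.0, 1.0]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [1.0, 0.0]
new_A, new_B, likelihood = baum_welch_step(A, B, pi, [0, 0, 1, 1])
```

Transitions that start at zero stay at zero after re-estimation, so a left–right initial matrix keeps its left–right structure during training.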

4) Decode these symbols into values ${\hat{q}}_{p}$ .

The output ${\hat{q}}_{p}$ can be regarded as the decoded values of the observation symbols from the codebook:

${\hat{q}}_{p} = [{\hat{q}}_{i 1} \dots {\hat{q}}_{i N_{i}}] = [o_{i, s c_{1}} (\cdot, h) \dots o_{i, s c_{N_{i}}} (\cdot, h)]$

The output of the HMM $\hat{q}$ is a sequence of discrete points. In order to generate a smooth trajectory, we use spline interpolation. We use the following third-order (cubic) spline:

$s_{t} (x) = a_{t} {(x - {\hat{q}}_{t})}^{3} + b_{t} {(x - {\hat{q}}_{t})}^{2} + c_{t} (x - {\hat{q}}_{t}) + d_{t}$

(10.28)

where $t = 1, 2 \dots T_{i}$ indexes the piecewise functions. Since the time indices of the output are the key-points $K_{i}^{r}$ in (10.8), the total time of the trajectory $\hat{q}$ is

$T_{i} = \frac{1}{v_{i}} \sum_{j = 1}^{v_{i}} k_{i j}, i = 1 \dots n$

When the discrete point $x_{t}$ is the output of the HMM $x_{i j}$ , the spline function gives a smooth trajectory ${\hat{q}}^{⁎} (t)$ . After time scaling, the desired trajectory is

${\hat{q}}^{⁎} (α t)$

(10.29)

where α is the time scale factor in joint space. Finally, we give the complete scheme of our algorithm; see Fig. 10.5.
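The smoothing and time-scaling step can be illustrated with a natural cubic spline through the decoded points. The natural boundary condition, the function names, and the toy knots are assumptions; the sketch also assumes that α rescales the playback duration, so that α = 0.5 halves it as in the experiments.

```python
import bisect


def natural_cubic_spline(times, values):
    """Return a function evaluating the natural cubic spline through the knots.

    times must be strictly increasing and contain at least two knots.
    """
    n = len(times) - 1
    h = [times[i + 1] - times[i] for i in range(n)]
    # Second derivatives m[0..n]; natural boundary: m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((values[i + 1] - values[i]) / h[i]
                      - (values[i] - values[i - 1]) / h[i - 1]) for i in range(1, n)]
        # Thomas algorithm for the tridiagonal system (forward elimination).
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        # Back substitution; m[n] = 0 closes the last equation.
        for k in range(n - 2, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Locate the segment containing x, clamped to the first/last segment.
        i = min(max(bisect.bisect_right(times, x) - 1, 0), n - 1)
        a, b = times[i + 1] - x, x - times[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (values[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (values[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


key_times = [0.0, 1.0, 2.0, 3.0]   # key-point times (hypothetical values)
decoded = [0.0, 1.0, 0.0, 1.0]     # decoded joint angles (hypothetical values)
q_star = natural_cubic_spline(key_times, decoded)
alpha = 0.5


def q_fast(t):
    """Time-scaled trajectory that completes in alpha times the original duration."""
    return q_star(t / alpha)
```

Because the spline is a function of continuous time, changing α only re-parameterizes the same joint-space path and leaves its shape unchanged.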

Figure 10.5 Trajectory generation via modified HMM and Lloyd's algorithm.

10.3 Experiments of learning trajectory

We evaluate the effectiveness of the proposed method with two examples: a two-link revolute joint arm (planar elbow manipulator) and our 4-DoF exoskeleton robot (CINVESRobot-1). The experiments are implemented in Matlab without code optimization. The computer is a PC with Intel Core i3 3.30 GHz processor. We use Kevin Murphy's HMM Toolbox [98] (Baum–Welch algorithm [9]) to train the HMM.

We compare our method (LHMM) with the other two approaches: HMM in task space with the inverse kinematics calculation (THMM), and HMM in joint space with the dynamic time warping (JHMM).

10.3.1 Two-link planar elbow manipulator

We use this example to show that our method can learn space and time differences in the demonstrations, and to show how the time difference affects the training results. The demonstrations differ in time by about $8 %$ (0.1 second).

The definitions of the two-link robot are shown in Fig. 10.6. By direct calculation, the forward kinematics and the inverse kinematics are

$x = l_{1} \cos q_{1} + l_{2} \cos (q_{1} + q_{2}), y = l_{1} \sin q_{1} + l_{2} \sin (q_{1} + q_{2})$

(10.30)

$\begin{matrix} q_{2} = \tan^{- 1} \frac{\pm \sqrt{1 - D^{2}}}{D}, D = \frac{x^{2} + y^{2} - l_{1}^{2} - l_{2}^{2}}{2 l_{1} l_{2}} \\ q_{1} = \tan^{- 1} (\frac{y}{x}) - \tan^{- 1} (\frac{l_{2} \sin q_{2}}{l_{1} + l_{2} \cos q_{2}}) \end{matrix}$

(10.31)
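Equations (10.30) and (10.31) can be checked numerically with a round trip from joint angles to the end-effector position and back. The function names and the `elbow_sign` parameter, which selects the ± branch of (10.31), are assumptions for this sketch; `atan2` is used in place of $\tan^{- 1}$ so that the quadrant is resolved correctly.

```python
import math


def forward_kinematics(q1, q2, l1, l2):
    """End-effector position of the two-link planar arm, as in (10.30)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, l1, l2, elbow_sign=1.0):
    """Joint angles reaching (x, y), as in (10.31); elbow_sign picks the branch."""
    D = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(D) > 1.0:
        raise ValueError("target is outside the reachable workspace")
    q2 = math.atan2(elbow_sign * math.sqrt(1.0 - D * D), D)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


l1, l2 = 1.0, 0.7                       # link lengths (hypothetical values)
x, y = forward_kinematics(0.3, 0.8, l1, l2)
q1, q2 = inverse_kinematics(x, y, l1, l2)
```

The second branch (`elbow_sign=-1.0`) reaches the same point with a different elbow configuration, which is one reason similar task-space paths can look different in joint space.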

Figure 10.6 Two-link planar elbow manipulator.

1) We first show how our algorithm works with time difference. We draw three similar broken lines in task space with different velocities; see Fig. 10.7. These lines are almost the same in task space. However, in joint space they are different; see Fig. 10.8.

Figure 10.7 Demonstrations (dotted-line) and generated trajectory (solid-line) in task space with LHMM.

Figure 10.8 Demonstrations (dotted-line) and generated trajectory (solid-line) in joint space with LHMM.

We use $N_{1} = N_{2} = 100$ as the codebook numbers for the two joints. These values are chosen heuristically to balance computational complexity and modeling accuracy. The key-point numbers of the joints are obtained from Lloyd's algorithm: for Joint-1, $v_{1}^{1} = 62$ , $v_{1}^{2} = 63$ , $v_{1}^{3} = 61$ ; for Joint-2, $v_{2}^{1} = 76$ , $v_{2}^{2} = 77$ , $v_{2}^{3} = 75$ . By (10.18), the state numbers of the HMM for the two joints are $v_{1} = 63$ , $v_{2} = 77$ . After the HMM is trained, it generates the desired trajectories $q_{1}^{⁎}$ and $q_{2}^{⁎}$ with different velocities by (10.29). If we select $α = 1$ , the solid lines in Fig. 10.8 are the desired joint angles. To make the robot move two times faster than its demonstrations, we select $α = 0.5$ ; the result is shown by the solid lines in Fig. 10.9. It is interesting to see that the corresponding task space trajectory, the solid line in Fig. 10.7, does not change.

Figure 10.9 Two times faster demonstrations (dotted-line) and generated trajectory (solid-line) in joint space with LHMM.

2) The second test is to show how our algorithm works for the space difference. The codebook numbers are also $N_{1} = N_{2} = 100$ . The key-point numbers are $v_{1}^{1} = 60$ , $v_{1}^{2} = 51$ , $v_{1}^{3} = 55$ , $v_{2}^{1} = 64$ , $v_{2}^{2} = 68$ , $v_{2}^{3} = 71$ . The generated trajectory in the task space is the solid line in Fig. 10.10. Their trajectories in joint space are shown in Fig. 10.11.

Figure 10.10 Spatially different demonstrations (dotted-line) and generated trajectory (solid-line) in task space with LHMM and forward kinematics.

Figure 10.11 Spatially different demonstrations (dotted-line) and generated trajectory (solid-line) in joint space with LHMM.

3) THMM works very well for the speed difference (Fig. 10.8), because the demonstrations in task space are similar, and the inverse kinematics (10.31) are known. However, it does not work well for the space difference (Fig. 10.10), because the HMM cannot work well with few demonstrations in task space. On the other hand, our method works in joint space with Lloyd's algorithm. It is not affected by these demonstrations; see Fig. 10.12.

Figure 10.12 Space different demonstrations (dotted-line) and generated trajectory (solid-line) in task space with THMM.

JHMM can work for both speed and space differences. Fig. 10.13 shows the results for the speed difference. The accuracy of JHMM is lower than that of our LHMM; see Fig. 10.10 and Fig. 10.11. The error comes from the dynamic time warping.

Figure 10.13 Demonstrations (dotted-line) and generated trajectory (solid-line) in task space with JHMM.

The modified HMM (10.17) uses its own output as an additional demonstration, so it can learn the joint space trajectories from a small number of demonstrations (three training samples), while the original HMM (10.16) cannot generate the joint space trajectories from three demonstrations.

10.3.2 4-DoF upper limb exoskeleton

The computer control platform for our upper limb exoskeleton is CINVESRobot-1. We first draw a 3-D “O” in the task space three times; see Fig. 10.14. At the same time, the joint angles are saved in the computer as the training trajectories; see Fig. 10.15. Then we use our modified HMM to train a model. Since the drawing speeds are different, the data sizes of the three demonstrations are 629, 576, and 615. We select the codebook numbers $N_{1} = N_{2} = N_{3} = 14$ . The key-point numbers are $v_{1}^{1} = 29$ , $v_{1}^{2} = 32$ , $v_{1}^{3} = 30$ . (See also Fig. 10.16.) After the HMM is trained, we use it to generate the desired trajectories in the joint space; see the solid lines in Fig. 10.17. These trajectories are sent to the lower level controller. The three motors use PID control to follow the trajectories. As a result, the robot draws another “O” in the task space; see the solid line in Fig. 10.18.

Figure 10.15 Training demonstrations for q₁ with LHMM.

Figure 10.16 Training demonstrations for q₂ with LHMM.

Figure 10.17 Training demonstrations for q₂ with LHMM.

Figure 10.18 Training demonstrations for q₃ with LHMM.

An advantage of our joint space method is that the time scale of the desired trajectory can be changed such that the “O” in the task space is drawn faster than in the demonstrations. Superhuman performance can be realized easily by our modified HMM. Fig. 10.19 shows that the trajectory generated by our method is twice as fast as the training demonstrations; here, $α = 0.5$ in (10.29).

Figure 10.19 CINVESRobot-1 draws “O” in task space with JHMM.

For this example, we do not compare our method LHMM with THMM, because the task space method THMM does not work with three demonstrations. JHMM can work in joint space for different speeds. Fig. 10.19 shows the result in task space. It is almost an “O,” but the accuracy is much lower than that of our LHMM, because the high-dimensional (3D) approximation of DTW is not good.

10.4 Conclusions

The advantage of programming a robot by demonstration in joint space is that it avoids the inverse kinematics. The disadvantage is that the demonstrations in joint space are strongly time-dependent. We use Lloyd's algorithm and modify the HMM to solve the problems in joint space.

We use Lloyd's algorithm to encode and quantize the input vectors. This approach improves accuracy compared with the dynamic time warping method. The traditional HMM is modified such that it can generate trajectories in joint space. Simulation and experimental results are also presented. The results show that the proposed method is more effective for generating robot trajectories. Although there are few works in joint space, we believe this direct method can be applied to more robots in the future.
