18.8 DESIGN 3: DESIGN SPACE EXPLORATION WHEN s₃ = [1 −0.5]

Figure 18.6 shows the DAG for the polynomial division algorithm based on our timing function choice s₃. We note from the figure that all signals are now pipelined, as indicated by the arrows connecting the nodes. However, some nodes do not lie on any equitemporal plane. We have two choices for the timing of a node that lies between two equitemporal planes. We could assign it a time value equal to that of either of the planes surrounding it. Alternatively, we could assign the node to operate on the negative edge of the clock. The former choice leads to nodes that do not have registers. The latter choice leads to nodes that have registers triggered by the negative edge of the clock. This is the option we follow here.

Figure 18.6 DAG for polynomial division algorithm when s₃ = [1 −0.5], n = 9, and m = 5.
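The timing argument above can be checked numerically. The following is a minimal sketch, not the book's own code: it evaluates t(p) = s₃p for each node p = [i j]^t and labels a node "neg" when its time value falls halfway between two integer equitemporal planes. The index ranges 0 ≤ i ≤ n and 0 ≤ j < m, and the names `node_time`, `clock_edge`, and `edges_by_column`, are assumptions made for illustration only.

```python
from fractions import Fraction

# Timing function s3 = [1 -0.5], kept as exact fractions to avoid float rounding.
S3 = (Fraction(1), Fraction(-1, 2))


def node_time(i, j):
    """Time value t(p) = s3 . p for the DAG node p = [i j]^t."""
    return S3[0] * i + S3[1] * j


def clock_edge(i, j):
    """Return 'pos' if the node lies on an integer equitemporal plane, else 'neg'."""
    return "pos" if node_time(i, j).denominator == 1 else "neg"


def edges_by_column(n=9, m=5):
    """Collect the clock edges used by the nodes of each column j.

    The ranges 0 <= i <= n and 0 <= j < m are assumed for illustration.
    """
    return {j: {clock_edge(i, j) for i in range(n + 1)} for j in range(m)}


if __name__ == "__main__":
    for j, edges in edges_by_column().items():
        print(j, sorted(edges))
```

Under these assumptions every node in a column with even j has an integer time value, while every node in a column with odd j lands on a half-step, which is why a whole column can share one clock edge.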

Similar to the two previous designs, we choose a projection vector given by

d₃ = [1 0]^t    (18.40)

The corresponding projection matrix P₃ is given by

P₃ = [0 1]    (18.41)

A point in the DAG given by the coordinates p = [i j]^t will be mapped by the projection matrix P₃ into the point p̄ = P₃p. The projected DAG corresponding to Design 3 is shown in Fig. 18.7. The projected DAG consists of m − 1 tasks. Input coefficients of A are fed from the right and the partial remainders are pipelined to all nodes. Coefficient b_j of B is stored in task T_j. The task details for hardware systolic array implementation are shown in Fig. 18.7b, where D denotes a 1-bit register to store the intermediate results. The even-numbered tasks contain two positive edge-triggered flip-flops, whereas the odd-numbered tasks contain two negative edge-triggered flip-flops.

Figure 18.7 Projected DAG or linear cellular automaton (LCA) processor array when s₃ = [1 −0.5], d₃ = [1 0]^t, n = 9, and m = 5. (a) The resulting tasks at each SPA stage. (b) The task workload details.
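The projection step can be illustrated with a short sketch. It assumes P₃ = [0 1], which is one matrix whose null space contains the projection direction d₃ = [1 0]^t given in the caption of Fig. 18.7. The matrix choice, the node index ranges, and the helper names `project` and `group_nodes_by_task` are assumptions for illustration, not taken from the text.

```python
# Projection direction d3 = [1 0]^t, as stated in the caption of Fig. 18.7.
D3 = (1, 0)

# ASSUMPTION: P3 = [0 1], chosen so that P3 . d3 = 0.
P3 = (0, 1)


def project(p):
    """Map a DAG point p = (i, j) to its projected point P3 . p (a task index)."""
    return P3[0] * p[0] + P3[1] * p[1]


def group_nodes_by_task(n=9, m=5):
    """Group DAG nodes by the task they project onto.

    The ranges 0 <= i <= n and 0 <= j < m are assumed for illustration.
    """
    groups = {}
    for i in range(n + 1):
        for j in range(m):
            groups.setdefault(project((i, j)), []).append((i, j))
    return groups


if __name__ == "__main__":
    # Nodes that differ only along d3 collapse onto the same task.
    for task, nodes in sorted(group_nodes_by_task().items()):
        print(f"T_{task}: {len(nodes)} nodes, first = {nodes[0]}")
```

With this choice, all nodes sharing the same j collapse onto one task, which agrees with the statement that coefficient b_j is stored in task T_j.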

This design is usually called a linear cellular automaton (LCA) [119]. The design shown here differs from LCAs discussed in the literature in several aspects:

1. Even tasks are clocked using the clock rising-edge.

2. Odd tasks are clocked using the clock falling-edge.

3. One of the inputs is fed from a MUX.

Let q_i and r_i, 0 ≤ i < m, be the present outputs of task T_i. The next-state outputs of task T_i are given by

(18.42)

(18.43)

And we identify the output and input of the LCA as

(18.44)

(18.45)

(18.46)

The above equations determine the operation of the LCA as follows:

1. Clear all the registers.

2. For time steps 0 and 2, the LCA operates as a simple shift register moving the coefficients a₉ to a₅ between the stages.

3. At time step 4, the first quotient coefficient q₄ is obtained and is available at the next time step to the leftmost node.

4. The coefficients of Q are obtained from the leftmost node at time steps 4–9.

5. At the end of time step 9, all the remainder polynomial R coefficients are stored in the shift register stages.

6. To shift the R coefficients out, the feedback path must be broken so that the feedback action is disabled.
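The steps above describe the hardware schedule. As a software reference for what the array should produce, the following sketch performs long division of A by B and returns the quotient and remainder coefficients. It assumes arithmetic over GF(2), which the 1-bit registers suggest. It is not a cycle-accurate model of the LCA, and the function names `gf2_divide` and `gf2_multiply` are hypothetical.

```python
def gf2_divide(a, b):
    """Divide polynomial a by b over GF(2) (assumed field).

    a, b: lists of 0/1 coefficients, highest degree first; b[0] must be 1.
    Returns (q, r): quotient and remainder coefficients, highest degree first.
    """
    if not b or b[0] != 1:
        raise ValueError("divisor must have a leading coefficient of 1")
    n, m = len(a) - 1, len(b) - 1
    if n < m:
        return [], list(a)
    r = list(a)
    q = []
    for k in range(n - m + 1):
        coef = r[k]          # next quotient coefficient, from q_{n-m} down to q_0
        q.append(coef)
        if coef:
            for t in range(m + 1):
                r[k + t] ^= b[t]   # subtract (XOR) the shifted divisor
    return q, r[n - m + 1:]


def gf2_multiply(x, y):
    """Multiply two GF(2) polynomials given highest degree first."""
    out = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        if xi:
            for j, yj in enumerate(y):
                out[i + j] ^= yj
    return out


if __name__ == "__main__":
    a = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]   # degree n = 9 (example values)
    b = [1, 0, 0, 1, 0, 1]               # degree m = 5 (example values)
    q, r = gf2_divide(a, b)
    print("Q =", q, "R =", r)
```

For n = 9 and m = 5 the quotient has five coefficients, q₄ down to q₀, and the remainder has five coefficients, one per shift register stage, matching steps 3 to 5 above.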
