Appendix D. SPU Instruction Set Reference

Chapter 15, “SPU Assembly Language,” presented SPU assembly coding in depth, but it had no room for the timing and pipeline usage of the individual instructions. Here, pipeline usage means whether an instruction is processed by the even pipeline (0) or the odd pipeline (1). This matters because the SPU can issue two instructions in the same cycle if they are processed by different pipelines.
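
As a rough illustration of this pairing rule, the sketch below copies the pipeline assignment for a handful of instructions from Table D.1 and checks whether two of them use different pipelines. The names `PIPELINE` and `can_dual_issue` are hypothetical, and the model is deliberately simplified: it looks only at the pipeline column and ignores anything else that may affect issue on real hardware, such as operand dependencies between the two instructions.

```python
# Pipeline assignments copied from a few rows of Table D.1
# (0 = even pipeline, 1 = odd pipeline).
PIPELINE = {
    "a": 0,      # add words
    "fma": 0,    # single-precision multiply and add
    "lqd": 1,    # load quadword
    "stqd": 1,   # store quadword
    "shufb": 1,  # shuffle bytes
}


def can_dual_issue(first, second):
    """Return True if the two instructions use different pipelines.

    Simplified model (an assumption of this sketch): only the
    pipeline column of the table is considered.
    """
    return PIPELINE[first] != PIPELINE[second]


if __name__ == "__main__":
    for pair in [("fma", "lqd"), ("a", "fma"), ("lqd", "shufb")]:
        print(pair, can_dual_issue(*pair))
```

Under this model, pairing an arithmetic instruction such as `fma` with a load such as `lqd` is a candidate for dual issue, whereas two loads or two arithmetic instructions are not.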

This appendix lists the SPU’s instructions in alphabetical order. Each entry shows the instruction’s latency (the number of clock cycles it requires), the pipeline it uses (0 or 1), and a description of its purpose.
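
One way to use the latency column is to estimate how long a chain of dependent instructions takes to produce its final result. The sketch below is a minimal model under stated assumptions: each instruction consumes the result of the one before it, issues as soon as that result is ready, and is delayed by nothing else. The names `LATENCY` and `chain_latency` are hypothetical; the cycle counts are copied from Table D.1.

```python
# Latencies (in clock cycles) copied from a few rows of Table D.1.
LATENCY = {
    "lqd": 6,    # load quadword
    "fm": 6,     # single-precision multiply
    "fa": 6,     # single-precision add
    "a": 2,      # add words
    "shufb": 4,  # shuffle bytes
}


def chain_latency(instructions):
    """Sum the latencies of a chain of dependent instructions.

    Simplified model (an assumption of this sketch): every instruction
    waits for the full latency of its predecessor and for nothing else.
    """
    return sum(LATENCY[name] for name in instructions)


if __name__ == "__main__":
    # Load a value, multiply it, then add to the product.
    print(chain_latency(["lqd", "fm", "fa"]))  # 6 + 6 + 6 = 18
```

Independent instructions can overlap in the pipelines, so a sum like this is only meaningful for a chain in which each step needs the previous result.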

Table D.1. SPU Instructions

Opcode	Latency	Pipeline	Purpose
`a rt,ra,rb`	2	0	Add words in `ra` and `rb`
`absdb rt,ra,rb`	4	0	Subtract bytes in `ra` from `rb`, returns absolute value
`addx rt,ra,rb`	2	0	Add words in `ra` and `rb` to LSB of `rt`
`ah rt,ra,rb`	2	0	Add halfwords in `ra` and `rb`
`ahi rt,ra,imm`	2	0	Add halfwords in `ra` and `imm` value
`ai rt,ra,imm`	2	0	Add words in `ra` to `imm` value
`and rt,ra,rb`	2	0	AND the values of `ra` and `rb`
`andbi rt,ra,imm`	2	0	AND the bytes of `ra` with the `imm` value
`andc rt,ra,rb`	2	0	AND the values of `ra` and the complement of `rb`
`andhi rt,ra,imm`	2	0	AND the halfwords of `ra` with the `imm` value
`andi rt,ra,imm`	2	0	AND the words of `ra` with the `imm` value
`avgb rt,ra,rb`	4	0	Average of bytes in `ra` and `rb`
`bg rt,ra,rb`	2	0	Generate borrow from `ra` and `rb`
`bgx rt,ra,rb`	2	0	Generate borrow from `ra`, `rb`, and the LSB of `rt`
`bi ra`	4	1	Branch to `ra`
`bid ra`	4	1	Branch to `ra`, disable interrupts
`bie ra`	4	1	Branch to `ra`, enable interrupts
`bihnz rt,ra`	4	1	Branch to `ra` if `rt` halfword doesn’t equal 0
`bihnzd rt,ra`	4	1	Branch to `ra` if `rt` halfword doesn’t equal 0, disable
`bihnze rt,ra`	4	1	Branch to `ra` if `rt` halfword doesn’t equal 0, enable
`bihz rt,ra`	4	1	Branch to `ra` if `rt` halfword equals 0
`bihzd rt,ra`	4	1	Branch to `ra` if `rt` halfword equals 0, disable
`bihze rt,ra`	4	1	Branch to `ra` if `rt` halfword equals 0, enable
`binz rt,ra`	4	1	Branch to `ra` if `rt` word doesn’t equal 0
`binzd rt,ra`	4	1	Branch to `ra` if `rt` word doesn’t equal 0, disable
`binze rt,ra`	4	1	Branch to `ra` if `rt` word doesn’t equal 0, enable
`bisl rt,ra`	4	1	Branch to `ra` and set link
`bisld rt,ra`	4	1	Branch to `ra` and set link, disable
`bisle rt,ra`	4	1	Branch to `ra` and set link, enable
`bisled rt,ra`	4	1	Branch to `ra` and set link if an event occurs
`bisledd rt,ra`	4	1	Branch to `ra` and set link if an event occurs, disable
`bislede rt,ra`	4	1	Branch to `ra` and set link if an event occurs, enable
`biz rt,ra`	4	1	Branch to `ra` if `rt` word equals 0
`bizd rt,ra`	4	1	Branch to `ra` if `rt` word equals 0, disable
`bize rt,ra`	4	1	Branch to `ra` if `rt` word equals 0, enable
`br imm`	4	1	Branch to sum of `imm` and PC
`bra imm`	4	1	Branch to the `imm` address
`brasl rt,imm`	4	1	Branch to `imm` and set link
`brhnz rt,imm`	4	1	Branch to sum of `imm` and PC if `rt` halfword doesn’t equal 0
`brhz rt,imm`	4	1	Branch to sum of `imm` and PC if `rt` halfword equals 0
`brnz rt,imm`	4	1	Branch to sum of `imm` and PC if `rt` word doesn’t equal 0
`brsl rt,imm`	4	1	Branch to sum of `imm` and PC, set link
`brz rt,imm`	4	1	Branch to sum of `imm` and PC if `rt` word equals 0
`cbd rt,index(ra)`	4	1	Create mask for byte insertion
`cbx rt,ra,rb`	4	1	Create mask for byte insertion
`cdd rt,index(ra)`	4	1	Create mask for doubleword insertion
`cdx rt,ra,rb`	4	1	Create mask for doubleword insertion
`ceq rt,ra,rb`	2	0	Compare equality of words in `ra` and `rb`
`ceqb rt,ra,rb`	2	0	Compare equality of bytes in `ra` and `rb`
`ceqbi rt,ra,imm`	2	0	Compare equality of bytes in `ra` to `imm` value
`ceqh rt,ra,rb`	2	0	Compare equality of halfwords in `ra` and `rb`
`ceqhi rt,ra,imm`	2	0	Compare equality of halfwords in `ra` to `imm` value
`ceqi rt,ra,imm`	2	0	Compare equality of words in `ra` to `imm` value
`cflts rt,ra,imm`	7	0	Convert float in `ra` to signed integer in `rt`, scaled by `imm`
`cfltu rt,ra,imm`	7	0	Convert float in `ra` to unsigned integer in `rt`, scaled by `imm`
`cg rt,ra,rb`	2	0	Generate carry vector from `ra` and `rb`
`cgt rt,ra,rb`	2	0	Return if words in `ra` are greater than words in `rb`
`cgtb rt,ra,rb`	2	0	Return if bytes in `ra` are greater than bytes in `rb`
`cgtbi rt,ra,imm`	2	0	Return if bytes in `ra` are greater than `imm`
`cgth rt,ra,rb`	2	0	Return if halfwords in `ra` are greater than `rb`
`cgthi rt,ra,imm`	2	0	Return if halfwords in `ra` are greater than `imm`
`cgti rt,ra,imm`	2	0	Return if words in `ra` are greater than `imm`
`cgx rt,ra,rb`	2	0	Generate carry vector from `ra`, `rb`, and the LSB of `rt`
`chd rt,index(ra)`	4	1	Create mask for halfword insertion
`chx rt,ra,rb`	4	1	Create mask for halfword insertion
`clgt rt,ra,rb`	2	0	Return if words in `ra` are logically greater than `rb`
`clgtb rt,ra,rb`	2	0	Return if bytes in `ra` are logically greater than `rb`
`clgtbi rt,ra,imm`	2	0	Return if bytes in `ra` are logically greater than `imm`
`clgth rt,ra,rb`	2	0	Return if halfwords in `ra` are logically greater than `rb`
`clgthi rt,ra,imm`	2	0	Return if halfwords in `ra` are logically greater than `imm`
`clgti rt,ra,imm`	2	0	Return if words in `ra` are logically greater than `imm`
`clz rt,ra`	2	0	Count 0s preceding the first 1 in `ra`
`cntb rt,ra`	4	0	Count number of 1s in each byte of `ra`
`csflt rt,ra,imm`	7	0	Convert signed integer in `ra` to float in `rt`, scaled by `imm`
`cuflt rt,ra,imm`	7	0	Convert unsigned integer in `ra` to float in `rt`, scaled by `imm`
`cwd rt,index(ra)`	4	1	Create mask for word insertion
`cwx rt,ra,rb`	4	1	Create mask for word insertion
`dfa`^[1] `rt,ra,rb`	13	0	Add double-precision values in `ra` and `rb`
`dfm`^[1] `rt,ra,rb`	13	0	Multiply double-precision values in `ra` and `rb`
`dfma`^[1] `rt,ra,rb`	13	0	Multiply double-precision values in `ra` and `rb`, add to `rt`
`dfms`^[1] `rt,ra,rb`	13	0	Multiply double-precision values in `ra` and `rb`, subtract values in `rt`
`dfnma`^[1] `rt,ra,rb`	13	0	Multiply double-precision values in `ra` and `rb`, add values in `rt`, negate result
`dfnms`^[1] `rt,ra,rb`	13	0	Multiply double-precision values in `ra` and `rb`, subtract values in `rt`, negate result
`dfs`^[1] `rt,ra,rb`	13	0	Subtract double-precision values in `rb` from `ra`
`dsync`	4	1	Ensure LS data is current before external access
`eqv rt,ra,rb`	2	0	Bitwise equivalence of `ra` and `rb`: each result bit is 1 if the bits match, 0 otherwise
`fa rt,ra,rb`	6	0	Add single-precision values in `ra` and `rb`
`fceq rt,ra,rb`	2	0	Compare floating-point equality of `ra` and `rb`
`fcgt rt,ra,rb`	2	0	Return if floating-point `ra` is greater than floating-point `rb`
`fcmeq rt,ra,rb`	2	0	Compare floating-point equality of `ra` and `rb` magnitudes
`fcmgt rt,ra,rb`	2	0	Return if floating-point magnitude of `ra` is greater than that of `rb`
`fesd`^[1] `rt,ra`	13	0	Convert `float` in `ra` to `double` in `rt`
`fi rt,ra,rb`	7	0	Floating-point interpolate between `ra` and `rb`
`fm rt,ra,rb`	6	0	Multiply floating-point values in `ra` and `rb`
`fma rt,ra,rb,rc`	6	0	Multiply floating-point values in `ra` and `rb`, add to values in `rc`
`fms rt,ra,rb,rc`	6	0	Multiply floating-point values in `ra` and `rb`, subtract values in `rc`
`fnms rt,ra,rb,rc`	6	0	Multiply floating-point values in `ra` and `rb`, subtract values in `rc`, negate result
`frds`^[1] `rt,ra`	13	0	Round `double` in `ra` to `float` in `rt`
`frest rt,ra`	4	1	Floating-point reciprocal estimate
`frsqest rt,ra`	4	1	Floating-point reciprocal absolute square-root estimate
`fs rt,ra,rb`	6	0	Subtract floating-point values in `rb` from `ra`
`fscrrd`^[1] `rt`	13	0	Move floating-point status and control register to `rt`
`fscrwr ra`	7	0	Move `ra` to floating-point status and control register
`fsm rt,ra`	4	1	Form select mask for words
`fsmb rt,ra`	4	1	Form select mask for bytes
`fsmbi rt,imm`	4	1	Form select mask for bytes with `imm`
`fsmh rt,ra`	4	1	Form select mask for halfwords
`gb rt,ra`	4	1	Concatenate LSBs of each word in `ra`
`gbb rt,ra`	4	1	Concatenate LSBs of each byte in `ra`
`gbh rt,ra`	4	1	Concatenate LSBs of each halfword in `ra`
`hbr imm,ra`	15	1	Hint that the branch at PC + `imm` will target the address in `ra`
`hbra imm1,imm2`	15	1	Hint that the branch at PC + `imm1` will target the address `imm2`
`hbrp`	15	1	Hint for upcoming branch, prefetch
`hbrr imm1,imm2`	15	1	Hint that the branch at PC + `imm1` will target PC + `imm2`
`heq ra,rb`	2	0	Halt if `ra` equals `rb`
`heqi ra,imm`	2	0	Halt if `ra` equals `imm`
`hgt ra,rb`	2	0	Halt if `ra` is greater than `rb`
`hgti ra,imm`	2	0	Halt if `ra` is greater than `imm`
`hlgt ra,rb`	2	0	Halt if `ra` is logically greater than `rb`
`hlgti ra,imm`	2	0	Halt if `ra` is logically greater than `imm`
`il rt,imm`	2	0	Load each word in `rt` with the `imm` value
`ila rt,imm`	2	0	Load `imm` (18-bit) into the LSBs of `rt`
`ilh rt,imm`	2	0	Load each halfword in `rt` with the `imm` value
`ilhu rt,imm`	2	0	Load the high halfword of each word in `rt` with `imm`
`iohl rt,imm`	2	0	OR the low halfword of each word in `rt` with `imm`
`iret`	4	1	Interrupt return
`iretd`	4	1	Interrupt return, disable
`irete`	4	1	Interrupt return, enable
`lnop`	0	1	No operation (pipeline 1)
`lqa rt,lsa`	6	1	Load quadword from `lsa` to register `rt`
`lqd rt,index(ra)`	6	1	Load quadword from `ra+index` to register `rt`
`lqr rt,lsa`	6	1	Load quadword from `lsa+PC` to register `rt`
`lqx rt,ra,rb`	6	1	Load quadword from `ra+rb` to register `rt`
`mfspr rt,imm`	6	1	Move special-purpose register `imm` to `rt`
`mpy rt,ra,rb`	7	0	Multiply low halfwords in `ra` and `rb`
`mpya rt,ra,rb,rc`	7	0	Multiply signed low halfwords in `ra` and `rb`, add to words in `rc`
`mpyh rt,ra,rb`	7	0	Multiply high halfwords of `ra` and low halfwords of `rb`
`mpyhh rt,ra,rb`	7	0	Multiply high halfwords of `ra` and `rb`
`mpyhha rt,ra,rb`	7	0	Multiply high halfwords of `ra` and `rb`, and add `rt`
`mpyhhau rt,ra,rb`	7	0	Multiply unsigned high halfwords of `ra` and `rb`, and add `rt`
`mpyhhu rt,ra,rb`	7	0	Multiply unsigned high halfwords of `ra` and `rb`
`mpyi rt,ra,imm`	7	0	Multiply low halfwords in `ra` by `imm` value
`mpys rt,ra,rb`	7	0	Multiply low halfwords in `ra` and `rb` and shift right
`mpyu rt,ra,rb`	7	0	Multiply unsigned low halfwords in `ra` and `rb`
`mpyui rt,ra,imm`	7	0	Multiply unsigned low halfwords in `ra` by `imm` value
`mtspr imm,rt`	6	1	Move `rt` to special-purpose register `imm`
`nand rt,ra,rb`	2	0	NAND the values of `ra` and `rb`
`nop`	0	0	No operation (pipeline 0)
`nor rt,ra,rb`	2	0	NOR the values of `ra` and `rb`
`or rt,ra,rb`	2	0	OR the values of `ra` and `rb`
`orbi rt,ra,imm`	2	0	OR the bytes of `ra` with the `imm` value
`orc rt,ra,rb`	2	0	OR the values of `ra` and the complement of `rb`
`orhi rt,ra,imm`	2	0	OR the halfwords of `ra` with the `imm` value
`ori rt,ra,imm`	2	0	OR the words of `ra` with the `imm` value
`orx rt,ra`	4	1	OR the words of `ra`
`rchcnt rt,imm`	6	1	Read capacity of channel `imm` into `rt`
`rdch rt,imm`	6	1	Read data from channel `imm` into `rt`
`rot rt,ra,rb`	4	0	Rotate bits in words of `ra` left according to `rb`
`roth rt,ra,rb`	4	0	Rotate bits in halfwords of `ra` left according to `rb`
`rothi rt,ra,imm`	4	0	Rotate bits in halfwords of `ra` left according to `imm`
`rothm rt,ra,rb`	4	0	Shift bits in halfwords of `ra` right according to `-rb`
`rothmi rt,ra,imm`	4	0	Shift bits in halfwords of `ra` right according to `-imm`
`roti rt,ra,imm`	4	0	Rotate bits in words of `ra` left according to `imm`
`rotm rt,ra,rb`	4	0	Shift bits in words of `ra` right according to `-rb`
`rotma rt,ra,rb`	4	0	Shift bits in words of `ra` right according to `-rb` (algebraic)
`rotmah rt,ra,rb`	4	0	Shift bits in halfwords of `ra` right according to `-rb` (algebraic)
`rotmahi rt,ra,imm`	4	0	Shift bits in halfwords of `ra` right according to `-imm` (algebraic)
`rotmai rt,ra,imm`	4	0	Shift bits in words of `ra` right according to `-imm` (algebraic)
`rotmi rt,ra,imm`	4	0	Shift bits in words of `ra` right according to `-imm`
`rotqbi rt,ra,rb`	4	1	Rotate entire `ra` left by bits according to `rb`
`rotqbii rt,ra,imm`	4	1	Rotate entire `ra` left by bits according to `imm`
`rotqby rt,ra,rb`	4	1	Rotate entire `ra` left by bytes according to `rb`
`rotqbybi rt,ra,rb`	4	1	Rotate entire `ra` left by bytes according to `rb` count
`rotqbyi rt,ra,imm`	4	1	Rotate entire `ra` left by bytes according to `imm`
`rotqmbi rt,ra,rb`	4	1	Shift entire `ra` right by bits according to `-rb`
`rotqmbii rt,ra,imm`	4	1	Shift entire `ra` right by bits according to `-imm`
`rotqmby rt,ra,rb`	4	1	Shift entire `ra` right by bytes according to `-rb`
`rotqmbybi rt,ra,rb`	4	1	Shift entire `ra` right by bytes according to `-rb` count
`rotqmbyi rt,ra,imm`	4	1	Shift entire `ra` right by bytes according to `-imm`
`selb rt,ra,rb,rc`	2	0	Select bits from `ra` and `rb` according to `rc`
`sf rt,ra,rb`	2	0	Subtract words in `ra` from `rb`
`sfh rt,ra,rb`	2	0	Subtract halfwords in `ra` from `rb`
`sfhi rt,ra,imm`	2	0	Subtract halfwords in `ra` from `imm` value
`sfi rt,ra,imm`	2	0	Subtract words in `ra` from `imm` value
`sfx rt,ra,rb`	2	0	Subtract words in `ra` from `rb` and LSB of `rt`
`shl rt,ra,rb`	4	0	Shift bits in words of `ra` left according to `rb`
`shlh rt,ra,rb`	4	0	Shift bits in halfwords of `ra` left according to `rb`
`shlhi rt,ra,imm`	4	0	Shift bits in halfwords of `ra` left according to `imm`
`shli rt,ra,imm`	4	0	Shift bits in words of `ra` left according to `imm`
`shlqbi rt,ra,rb`	4	1	Shift entire `ra` left by bits according to `rb`
`shlqbii rt,ra,imm`	4	1	Shift entire `ra` left by bits according to `imm`
`shlqby rt,ra,rb`	4	1	Shift entire `ra` left by bytes according to `rb`
`shlqbybi rt,ra,rb`	4	1	Shift entire `ra` left by bytes according to `rb` count
`shlqbyi rt,ra,imm`	4	1	Shift entire `ra` left by bytes according to `imm`
`shufb rt,ra,rb,rc`	4	1	Form `rt` from the bytes of `ra` and `rb` according to `rc`
`stop`	4	1	Halt the SPU and send stop signal to PPU
`stopd`	4	1	Halt the SPU and send signal (can be used as breakpoint)
`stqa rt,lsa`	6	1	Store quadword from register `rt` to `lsa`
`stqd rt,index(ra)`	6	1	Store quadword from register `rt` to `ra+index`
`stqr rt,lsa`	6	1	Store quadword from register `rt` to `lsa+PC`
`stqx rt,ra,rb`	6	1	Store quadword from register `rt` to `ra+rb`
`sumb rt,ra,rb`	4	0	Add bytes in `ra` and `rb`, return halfword results
`sync`	4	1	Force SPU to complete all store operations before continuing
`syncc`	4	1	Force SPU to complete store and channel operations
`wrch imm,rt`	6	1	Write data from `rt` into channel `imm`
`xor rt,ra,rb`	2	0	XOR the values of `ra` and `rb`
`xorbi rt,ra,imm`	2	0	XOR the bytes of `ra` with the `imm` value
`xorhi rt,ra,imm`	2	0	XOR the halfwords of `ra` with the `imm` value
`xori rt,ra,imm`	2	0	XOR the words of `ra` with the `imm` value
`xsbh rt,ra`	2	0	Sign extend bytes in `ra` to halfwords
`xshw rt,ra`	2	0	Sign extend halfwords in `ra` to words
`xswd rt,ra`	2	0	Sign extend words in `ra` to doublewords
^[1] The double-precision math instructions (`dfa`, `dfm`, `dfma`, `dfms`, `dfnma`, `dfnms`, and `dfs`) and certain single-precision instructions (`fesd`, `frds`, and `fscrrd`) cause the pipeline to stall for six cycles. This means that after the instruction completes, the pipeline has to wait at least six cycles before issuing a new instruction.
