Chapter 3 Introduction to the PIC24 Microcontroller Family

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

CHAPTER 3

INTRODUCTION TO THE PIC24 MICROCONTROLLER FAMILY

This chapter introduces the PIC24 instruction set architecture by exploring the data memory structure and data transfer instructions of the PIC24 microcontroller (μC). The use of the MPLAB® Integrated Design Environment for assembly and simulation of PIC24 programs is also discussed.

Learning Objectives

After reading this chapter, you will be able to:

Describe the data and program memory architecture of the PIC24 μC.

Convert PIC24 instruction mnemonics to machine code and vice versa.

Describe the operation of the register direct, register indirect, immediate, file register, and default working register addressing modes.

Describe the operation of the mov, add, sub, inc, inc2, dec, dec2, and goto instructions.

Translate (manually compile) a simple C program into PIC24 assembly language.

Compute the number of clock cycles and the amount of time required to execute simple instruction sequences for the PIC24 μC.

Introduction to Microprocessors and Microcontrollers

In the previous chapter, a computer was defined as a digital system composed of control, input/output, and memory components whose operation is controlled by instructions stored in memory. The first computers were designed in the early 1940s and filled entire rooms, with total processing capability that was less than a modern digital watch. Early computers used vacuum tubes (grossly, a current amplifier within a glass tube) to implement logic, and a single logic gate could take up an entire board. Transistors were invented by Bell Labs in 1947 [1], allowing an order of magnitude reduction in the size of a logic gate implementation. However, transistors were packaged individually, and computers still required a large number of circuit boards to implement. In 1958, Jack Kilby, a researcher at Texas Instruments Inc. created the first integrated circuit [1], which is a silicon substrate upon which circuits with multiple transistors can be fabricated (the slang term chip is now commonly applied to integrated circuits). As integrated circuit fabrication techniques evolved, the size of integrated circuit transistors steadily decreased, allowing increasing numbers of transistors to be placed on the same silicon substrate. In 1971, Intel® developed a set of four integrated circuits that implemented a 4-bit computer [1]. The data paths were 4 bits, much like the number sequencing computer of Chapter 2. One chip, the 4004, implemented the instruction decode and execution (the central processing unit, or CPU), while the other chips implemented the memory and input/output. The term microprocessor (μP) was applied to this chipset, as it was a very small (micro) processing engine. The 4004 chip is generally regarded as the world’s first microprocessor. Integrated circuit technology has continually improved since the 4004, producing two distinct paths of microprocessor evolution. One evolution path has stressed high performance, using the increasing number of transistors to build larger internal data paths (up to 64 bits) and registers, advanced numerical processing, and support for very large memory spaces. These microprocessors are referred to as general purpose microprocessors and expect programs and data to be stored in memory external to the microprocessor. General-purpose microprocessors require external support chips, known as a chipset, that allow them to interface with memory and input/output devices. Examples of general purpose microprocessors are the Intel Core™ i3/i5/i7, the Advanced Micro Devices FX, and the IBM PowerPC® families. Because of shrinking transistor size, it is now common to integrate multiple high performance CPUs on the same chip, as evidenced by the Intel Core™ i7 Extreme processor with six cores and the AMD FX processor with eight cores.

The other microprocessor evolution path has stressed higher integration and lower cost, with the goal of producing a single-chip solution to problems requiring a stored program machine approach. The term microcontroller (μC) is generally applied to these devices. A microcontroller typically expects its program and data to be stored on-chip, with any logic required for external input/output devices also integrated into the same device. Thus, a microcontroller implements all of the components of a computer—control, memory, and input/output—in one chip. Microcontroller solutions are usually very cost sensitive, so applying exactly the right amount of processing power to a problem to minimize cost is important. As such, a large number of microcontroller families are available from 8 bits to 32 bits, with widely varying amounts of on-chip memory and different combinations of input/output interface options (whose number and variety grow each year).

Microcontroller versions of general purpose microprocessors have also been introduced over the years, so the distinction between a microcontroller and a microprocessor has become somewhat blurred and in some cases is an arbitrary labeling. In this book, the term microcontroller is applied to devices in the PIC24 family, but the term microprocessor is used any time a more general labeling is desired.

The PIC24 Microcontroller Family

The PIC24 microcontroller family, created by Microchip Technology Inc., is the focus of this book for discussing microprocessor programming, architecture, and interfacing topics. At the time of this writing, the branches of the PIC24 family tree are the PIC24E, PIC24H, and the PIC24F. The PIC24E is the latest branch and is the highest speed with a maximum execution speed of 70 million instructions per second (MIPS) at 3.3V. The PIC24H family was the first “high-speed” branch and has a maximum execution speed of 40 MIPS. The PIC24F family is a lower-speed/lower-power variant that has a wide voltage operating range of 2.0 V to 3.6 V with a maximum execution speed of 16 MIPS. There are many versions of PIC24E/H/F microcontrollers available that differ in on-chip memory size, number of I/O pins, and on-chip peripheral features. However, all of the PIC24 family members use the same instruction set architecture (ISA). In fact, the PIC24 instruction set architecture is a subset of the dsPIC30/dsPIC33F/E ISA, another 16-bit microcontroller family from Microchip. The ds in dsPIC® stands for Digital Signal Processing (DSP), and dsPIC® microcontrollers have extra instructions implemented by an on-chip DSP engine that accelerate the types of computations found in DSP applications. The PIC24E and dsPIC33E families have the same on-chip peripherals and only differ in that the PIC24E does not have the DSP engine and thus does not support the additional DSP-oriented instructions. This book does not cover the extra instructions implemented by the DSP engine. In the hardware chapters of this book, the PIC24 microcontroller variant used for example purposes is the dsPIC33EPXXXGP502, which is pin compatible with the PIC24EPXXXGP202. The XXX indicates the program memory size in kilobinary bytes, with available sizes of 32, 64, 128, 256, and 512. The GP stands for general purpose. The last number of 202 (PIC24) and 502 (dsPIC33) indicates the packaging, which is a dual-in-line (DIP) 28-pin package in this case. When a specific example device is needed, the book uses the dsPIC33EP128GP502 microcontroller, but refers to all the dsPIC30/33 and PIC24 families as a PIC24 microcontroller. Your introduction to the PIC24 family begins with the instruction set architecture and a good place to start is with a simplified block diagram of the processor core, as shown in Figure 3.1.

Figure 3.1
PIC24 CPU core simplified architectural diagram (this is a simplified view of Figure 3.1 found in the PIC24EPXXXGP datasheet (DS70657G), Microchip Technology Inc.)

Source: Adapted with permission of the copyright owner, Microchip Technology, Incorporated.
All rights reserved. No further reprints or reproduction may be made without Microchip Inc.’s prior written consent.

The size of the internal data paths of the PIC24 CPU is 16 bits, so it is referred to as a 16-bit microcontroller. This means that the natural size for computations is 16 bits; arithmetic operations such as additions and subtractions that operate on 16-bit data or 8-bit data can be specified with one instruction. Operations on data larger than 16 bits can be performed but require multiple instructions to accomplish. The PIC24 instruction set defines 71 distinct instructions, of which the majority require 24 bits (3 bytes) to encode. The term instruction word is used to refer to a 24-bit machine code value. Two instruction types require two instruction words (6 bytes) to encode. The instruction register in Figure 3.1 contains the instruction word that is currently being executed. The arithmetic/logic unit (ALU) in Figure 3.1 is the combinational logic that performs operations such as addition, subtraction, bitwise AND/OR/XOR, and so forth. The left-side ALU input is from a register bank of 16 working registers named W0-W15, each of which is 16 bits wide. The right-side ALU input is connected to the X-data bus, which means that this operand can be from many different sources, with the most common sources being data memory or a working register. The ALU output is tied to the X-data bus, with the most common destination being data memory or a working register.

Program Memory Organization

Figure 3.1 shows that the PIC24 μC has separate memories for program instructions and data. This type of arrangement is known as a Harvard architecture, as early electromechanical calculators such as the Harvard Mark I [1] read instructions from punched tape, with memory used only for storing data. Most microprocessors store programs and data in the same memory, which means that instructions can access memory that contains instructions as easily as locations that contain data. The majority of PIC24 instructions can only access data memory; a few special mechanisms are provided enabling instructions to access program memory. The program counter (PC) is a 23-bit register that provides the address for accessing program memory for instruction fetch. Figure 3.2 illustrates how the PC value is used to access an instruction word in program memory. Even though an instruction word is 24 bits (3 bytes), it is actually more convenient to think of an instruction word as being 32 bits that consists of a least significant word (LSW) and most significant word (MSW), each 16 bits wide. The upper 8 bits of the most significant word are unimplemented and read as zeros; this is the phantom byte shown in Figure 3.2. In this model, program memory is addressable in 16-bit words, and the PC always points to the beginning of an instruction word, or the least significant word. This means that the least significant bit of the PC is always 0, and that the upper 22 bits of the PC contain the address of one instruction word out of a possible 2²² = 4 Mi instruction locations. At the time of the writing of this text, no PIC24 microcontroller physically implemented the full 4 Mi instruction space supported by the architecture definition. Circa 2014, PIC24 family members offer a wide range of physical memory sizes ranging from 6 Ki instructions up to approximately 170 Ki instructions, with new variants being regularly introduced.

Program memory is nonvolatile, meaning the memory contents are retained when power is removed. Some form of nonvolatile memory is required for any practical computer system, as this provides the instructions that are executed when power is applied. The PIC24 program memory is flash programmable, meaning that it can be electrically erased and programmed. Other types of nonvolatile memory are one-time-programmable (OTP), meaning it cannot be erased once programmed, and mask-programmed read-only memory (ROM), which means the memory contents are determined at memory manufacture time and cannot be changed. Read operations on flash memory are fast, in the tens of nanoseconds, but write operations require more time with a minimum of 20 μs required to write one program word.

Figure 3.2
Program memory addressing (redrawn by author from Figure 4.6 found in the PIC24EPXXXGP datasheet (DS70657G), Microchip Technology Inc.)

Data Memory Organization

Figure 3.3 shows that the data memory address space is 64 KiB total, although additional memory may be accessed via a paging scheme not covered in this text. Data memory is organized as 16-bit words, with the lower 8 bits referred to as the least significant byte (LSB) and the upper 8 bits as the most significant byte (MSB). Observe that the LSB and MSB acronyms use a capital B in reference to byte; the acronyms LSb and MSb are reserved for least significant bit and most significant bit, respectively. Data memory is byte addressable, which means that the MSB and LSB portions of a 16-bit word can be modified individually. LSB addresses are at even locations, while MSB addresses are at odd locations. Data memory is volatile, meaning that its contents are lost if power is removed. Data memory is divided into the following sections:

Special function registers. In the PIC24E/dsPIC33E family, the first 4096 locations of data memory (0x0000 to 0x0FFF) are reserved for special function registers (SFRs). An SFR is associated with some specific functionality implemented in the PIC24 μC. From an assembly language viewpoint, SFRs are treated in the same manner as a non-SFR data memory location, but their actual physical implementation is much different and depends upon the particular SFR. Most SFRs serve as data and control registers for on-chip peripherals such as the timers, the analog-to-digital converter, various serial interfaces, and so on. The W0-W15 working registers are SFRs, as well as the Program Counter. Even though each SFR is assigned a specific address, it is more common to refer to an SFR by its name rather than its location.

Data RAM. Data memory locations that are not SFRs are referred to by various names such as file registers, data RAM, and X data. Data RAM is used for variable storage by user programs, but is also used as a side effect of instruction execution in some cases (i.e., the storage of a return address by a subroutine call, covered in Chapter 6). As with program memory, the amount of data RAM physically implemented is dependent upon the particular PIC24 μC. However, all PIC24 μCs have some amount of data RAM, and this begins at location 0x1000 for PIC24E/dsPIC33E devices.

Near data memory. The first 8,192 locations in data memory are known as near RAM, and certain instruction forms can only access data memory locations in near RAM. Many PIC24 μC variants have physical data RAM that does not exceed the boundaries of near data space, and thus this distinction is not important in those devices. The assembly language examples in this text generally assume that all data RAM is located in near RAM. The dsPIC33EP128GP502 μC used in the hardware chapters of this text implements 16,384 bytes of data RAM, and thus a portion of its data RAM extends past near RAM. While 16,384 bytes of RAM seems tiny compared to the millions (and in many implementations, billions) of RAM bytes in a personal computer that contains a general purpose microprocessor, it is actually a generous data memory size for microcontrollers.

Program space visibility area. The upper 32 Ki of data memory is also known as the program space visibility (PSV) area, which allows a user program to access non-volatile data stored in program memory. The PSV is discussed in more detail in Chapter 6.

DMA RAM. Some PIC24 μCs have a portion of data RAM implemented as dual-port, static (DPS)RAM. The DPSRAM and associated DMA peripheral block allow some on-chip peripherals such as the analog-to-digital converter to transfer data to DMA RAM without intervention by the CPU core (i.e., without executing assembly language instruction for performing this data movement) and without slowing the CPU’s execution. Chapter 13 contains further discussion of DPSRAM and its usage.

The sizes of program and data memories for some PIC24EP variants are listed in Table 3.1.

Table 3.1: Some PIC24EP Microcontroller Memory Sizes

Figure 3.3
Data memory organization for the PIC24EP128GP (redrawn by author from Figure 4.14 found in the PIC24EPXXXGP datasheet (DS70657G), Microchip Technology Inc.)

Arrangement of Multibyte Values in Data Memory

This book’s assembly language and C programs use 8-bit, 16-bit, and 32-bit data sizes. Table 3.2 shows the unsigned ranges for these data sizes; Chapter 5 discusses signed data ranges for these same data sizes. As previously discussed, the lower byte of a 16-bit or 32-bit value is referred to as the least significant byte (LSB), while the upper byte is referred to as the most significant byte (MSB). Using a 16-bit or 32-bit variable instead of an 8-bit variable allows a larger number range to be represented, at the cost of requiring more data memory bytes.

When a 16-bit or 32-bit value is stored in data memory in the PIC24 μC, the bytes are arranged in least significant byte to most significant byte order in increasing memory locations. This arrangement is known as little-endian byte order, and it is in common use in many microprocessor implementations.

Table 3.2: Unsigned Data Ranges

Some microprocessor families use the reverse arrangement of most significant byte to least significant byte in increasing memory locations, which is referred to as big-endian byte order. There is no inherent advantage to little-endian or big-endian byte order (computer architects have adopted these terms in a tongue-in-cheek reference to the book Gulliver’s Travels published by Jonathon Swift in 1726, in which Lilliputians and Blefuscans disagree over which end of the egg to crack first, the little end or the big end, when enjoying boiled eggs). The microprocessor architects determine the byte ordering choice during the microprocessor’s design phase. Figure 3.4 illustrates the storage of a 16-bit number and a 32-bit number in memory, with the memory shown in both 8-bit and 16-bit widths for additional clarity. Memory diagrams in this book always show memory addresses increasing as you progress down the page, and generally data memory is shown as 16 bits wide. The use of 32-bit data in assembly language programs is covered in Chapter 5.

Figure 3.4
Storage of multibyte data

A requirement of the PIC24 μC architecture is that 16-bit and 32-bit data must have the LSB start at an even location (i.e., be word aligned) if the data are to be accessed by PIC24 assembly language instructions in 16-bit chunks (also known as word mode). It is the responsibility of the assembly language programmer to ensure that 16-bit and 32-bit values are word-aligned in memory. A word access to an odd address is known as a misaligned access, which generates an exception that triggers execution of an exception handler, whose default action is to reset the processor (exceptions are covered in Chapter 9). A misaligned write operation does not alter data memory contents. The assembly language and C programs in this book assume that 16-bit and 32-bit values are correctly word-aligned in data memory. When writing assembly language programs by hand, it is often necessary to insert extra bytes in order to maintain this alignment.

Data Transfer Instructions and Addressing Modes

One fundamental instruction class found within any microprocessor is the data transfer class. Data transfer instructions copy data between registers and locations in data memory, or write a value stored in the instruction word into data memory or registers. In most microprocessors, registers are separate from data memory, and instructions contain addresses for both registers and memory. However, special function registers on the PIC24EP μC are simply data memory locations with addresses of 0x0000-0x0FFF, making the terms register and data memory location interchangeable. This is why data memory locations are sometimes referred to as file registers in the Microchip PIC24 μC documentation.

The execution of data transfer instructions between data memory locations is described as:

(src) → dst “copy the content of source location src to destination location dst.”

The → symbol is a transfer symbol, and (src) → dst is called register transfer notation, which is used to symbolically describe instruction execution. The parentheses () on the left side is read as “content of,” and indicates that you are transferring the content of a source memory location or register to a destination memory location or register. This is a common notation used to describe microprocessor instruction actions, and both register transfer notation and word descriptions are used in this book for discussing instruction execution.

Register Direct Addressing

For the PIC24 μC, instructions that copy data between data memory locations are called move instructions and the assembly language mnemonic is mov. This is somewhat unfortunate, as the word move implies removing data from one location and placing it in another location, but mov instructions do not affect the content of the source memory location. A mov instruction has two operands, source and destination, as noted earlier for data transfer operations. The two operands specify the addresses of the data to be moved. The method by which the address is specified is called the addressing mode. The simplest addressing mode is register direct, which specifies one of the 16 working registers. The following mov instruction copies the content of register W3 to W5:

mov   W3, W5       ; copy the content of W3 to W5

Observe that for the PIC24 mov instruction, the first register is the source and the second register is the destination; other microcontroller instruction sets may reverse this order. In this book, lowercase is generally used for assembly language mnemonics and uppercase for register names (instruction mnemonics and working register names are actually case insensitive, so upper/lower case is used for emphasis only). The mov instruction has other addressing modes that are available for both source and destination operands, namely immediate, file register and six types of indirect addressing. This chapter provides examples of register direct, immediate, and file register addressing. One form of indirect addressing is discussed as well, with detailed coverage of the various indirect addressing modes reserved for Chapter 6. Because a register placeholder in an instruction format can be one of several different addressing modes, the following register placeholder symbols are used when discussing instruction formats:

Wn: Register direct addressing; Wn specifies one of W0, W1, ... W15.

Wns: Register direct addressing; Wns specifies one of W0, W1, ... W15.

Wnd: Register direct addressing; Wnd specifies one of W0, W1, ... W15.

Wb: Register direct addressing; Wb specifies one of W0, W1, ... W15.

WREG: The working register; specifies W0 in file register instructions.

Ws: Register direct (Ws) and indirect addressing modes ([Ws], [Ws ++], [Ws – –], [++Ws], [– –Ws]); Ws specifies one of W0, W1, ... W15.

Wd: Register direct (Wd) and indirect addressing modes ([Wd], [Wd ++], [Wd – –], [++Wd], [– –Wd]); Wd specifies one of W0, W1, ... W15.

Wso: All of the addressing modes of Ws, with the additional mode of register offset indirect [Wso + Wb]; Wso specifies one of W0, W1, ... W15.

Wdo: All of the addressing modes of Wd, with the additional mode of register offset indirect [Wdo + Wb]; Wdo specifies one of W0, W1, ... W15.

For example, the general form of the previous move instruction is mov Wso, Wdo with each operand able to use seven different addressing modes. Examples are:

mov W2, W3            ; both source and destination use reg. direct
mov [W2], [W3 + W7]   ; both source and destination use reg. indirect
mov [--W4], W5        ; source uses reg. indirect, dest. uses reg. direct

As previously mentioned, only one indirect addressing mode is discussed in this chapter; the remaining modes are reserved for Chapter 6.

The previous mov instruction operates in word mode; that is, the 16 bits in W3 are copied to W5. Word mode operation is the default for instructions, and the mnemonic can be written with a .w extension if desired:

mov.w  W3, W5       ; copy the content of W3 to W5

A byte mode is also available, and must be specified with a .b extension as follows:

mov.b  W3, W5      ; copy the content of W3.LSB to W5.LSB

In this case, only the LSB of W3 is copied to the LSB of W5; the MSB of W5 is unaffected. This book uses the notation Wn.LSB and Wn.MSB to refer to the least significant and most significant bytes of register Wn, respectively. Figure 3.5 shows another example of a mov using register direct addressing.

Figure 3.5
mov using register direct addressing

Figure 3.6(a) gives the instruction format for the mov.{b} Wso, Wdo instruction. Observe that the ssss bit field in the instruction format specifies the Wso register number, while the dddd bit field specifies the Wdo register number. The ggg and hhh fields specify the addressing modes of the source and destination operands, respectively. The wwww field is only used if a particular form of indirect addressing is used (register offset indirect). Figure 3.6(b) shows the conversion of the mov W3, W5 instruction to its machine code representation of 0x780283. The 3-bit code for register direct addressing is 0b000 (see Appendix A), which is used for the ggg, hhh fields since both the source and destination of the mov W3, W5 instruction uses register direct addressing. The wwww field is unused and can be set to 0b000 because register direct addressing, rather than indirect addressing, is used. The machine code for the mov.b W3, W5 instruction is given in Figure 3.6(c) with the B-field bit (byte versus word operation) being the only difference from Figure 3.6(b).

This text does not attempt to discuss the instruction formats of every PIC24 instruction, but instead covers the formats of a few select instructions from the different classes of PIC24 instructions. A table of the machine code formats for all PIC24 instructions can be found in Appendix A.

Figure 3.6
Instruction format: mov{.b} Wso, Wdo

File Register Addressing

File register addressing specifies a data RAM address encoded as part of the instruction machine code (in other microprocessor families, this type of addressing is called memory direct addressing, since the memory address is directly specified in the instruction word). Figure 3.7(a) shows the instruction format for a mov instruction whose source operand uses register direct addressing and whose destination operation uses file register addressing. This mov form only operates in word mode, so the file register address must be an even address to satisfy the word-alignment rules previously discussed. Because the file register address is word-aligned, its lower bit must be zero, so only the upper 15 bits of the 16-bit address are encoded in the instruction. The conversion of mov W3, 0x1002 to machine code is given in Figure 3.7(b).

Figure 3.8 shows the operation of a mov W3, 0x1002 instruction in word mode, causing the bytes in locations 0x1002 and 0x1003 to become copies of W3.LSB and W3.MSB, respectively.

Figure 3.7
Instruction format: mov Wns, f

Figure 3.8
Example operation: mov Wn, f

Given that there is a method for copying data from a working register to data RAM, it follows that the converse must be possible. Figure 3.9 shows the execution of the mov 0x1002, W3 instruction, which copies the content of data RAM location 0x1002 to W3. The general form of this mov form is mov f, Wn and it also can only function in word mode. See Appendix A for the machine code format of this mov type.

Figure 3.9
Example operation: mov f, Wn

WREG—The Default Working Register

The astute reader may be wondering at this point how to transfer a byte between a working register and data RAM, since the mov Wn, f and mov f, Wn instructions only operate in word mode. Byte transfers between working registers and data memory are accomplished by either using the indirect addressing modes (Chapter 6) or by using a mov instruction form that uses the default working register. The mov{.b} WREG, f instruction format is shown in Figure 3.10. There are two important points about this instruction format. First, the default working register (WREG) is W0. Second, because only 13 bits are reserved for the data memory address, f has to be in the first 8192 locations (2¹³ = 8192) of data memory. As previously discussed, this area of data RAM is known as near RAM. Any PIC24 instruction that uses WREG and a file register address has this near RAM restriction.

Figure 3.11 illustrates a byte transfer from the LSB of W0 to memory location 0x1001 using the mov.b WREG, 0x1001 instruction. While WREG is synonymous with W0 in terms of instruction execution, it is a syntax error to write this instruction form as mov.b W0, 0x1001.

Figure 3.10
Instruction format: mov{.b} WREG, f

Figure 3.11
Example operation: mov{.b} WREG, f

To transfer the LSB of any other working register besides W0 to data RAM, execute a two-instruction sequence:

mov.b  Wn, W0       ; move the Wn.LSB to W0.LSB
mov.b  WREG, f      ; move W0.LSB to data memory

Figure 3.12 shows the format of the mov{.b} f {, WREG} instruction, which copies from a data memory location to either the WREG or back to itself. Observe that the D bit in the machine code selects between a destination of WREG (D bit = 0) or f (D bit = 1). At this point it may seem pointless to copy a data RAM value back to itself, but the usefulness of this operation is discussed in Chapter 4. Figure 3.13 shows an example operation of mov{.b} f {, WREG} for the case of mov.b 0x1000, WREG.

Figure 3.12
Instruction format: mov{.b} f {, WREG}

Figure 3.13
Example operation: mov{.b} f {, WREG}

Immediate Addressing

The immediate addressing mode encodes a literal (a constant) in the instruction word. Figure 3.14 shows the instruction format for immediate addressing used in a mov instruction to load a working register with a constant. The constant is a 16-bit value if a word operation is performed and an 8-bit value if a byte operation is done. The instruction encoding is such that these are actually two different opcodes with the instruction encoding that is used dependent on the literal size.

Figure 3.14
Instruction format: mov #lit16, Wn; mov.b #lit8, Wn

It is common for a reader new to assembly language programming to confuse immediate addressing with file register addressing. The difference between these two addressing modes is illustrated in Figure 3.15. In Figure 3.15(a), the mov #0x1000, W2 instruction (immediate addressing) copies the value 0x1000 into register W2. Contrast this with Figure 3.15(b), where the mov 0x1000, W2 instruction copies the word located at data RAM location 0x1000 into W2, resulting in a modified W2 value of 0x8B1A. The # symbol used with 0x1000 differentiates immediate addressing from file register addressing. Figure 3.15(c) illustrates execution of a mov instruction that uses an 8-bit literal; observe that only the LSB of W2 is affected. If a word operation is used in Figure 3.15(c), then all 16-bits of W2 are written.

Figure 3.15
Instruction format: mov{.b} #lit, Wn

Indirect Addressing

A mov that uses one of the many forms of indirect addressing for both source and destination operands is:

mov [Wns], [Wnd] ; ((Wns)) → (Wnd)

Compare this with a mov that uses register direct addressing for both operands:

mov Wns, Wnd  ; (Wns) → Wnd

Note that the operands for the indirect addressing example use brackets [] around the Wns and Wnd operands, indicating an indirect addressing mode. The register transfer description for the indirect example has an extra set of parentheses () around both the Wns and Wnd operands. The ((Wns)) → (Wnd) register transfer operation means the following:

The content of Wns, also written as (Wns), is an address called the source effective address, abbreviated as EAs. The content of the source effective address, written as (EAs), is the source operand. Thus, the source operand is fully specified as ((Wns)).

The content of Wnd, also written as (Wnd), is an address called the destination effective address, abbreviated as EAd. The mov copies the source operand to the memory location specified by the destination effective address, or in register transfer notation: (EAs) → EAd or ((Wns)) →(Wnd).

A mov instruction that uses indirect addressing for one or both of its operands has an instruction format much like the mov instructions that use immediate or direct addressing modes discussed earlier. The details of the instruction format for indirect addressing mov instruction are left as an exercise for you.

Figure 3.16 shows three mov instruction examples. Figure 3.16(a) shows a mov W0, W1 instruction; note that this uses register direct addressing for both source and destination operands. The end result of Figure 3.16(a) is that the content of W0 is copied to W1, modifying W1 to contain 0x1000. Figure 3.16(b) shows a mov [W0], [W1] instruction that uses register indirect addressing for both source and destination operands. The source operand’s effective address is the content of W0, or 0x1000. The destination effective address is the content of W1, or 0x1002. Thus, the mov’s execution copies the content of location 0x1000 to location 0x1002, modifying location 0x1002 to contain 0x8BFA. The final example in Figure 3.16(c) shows a mov W0, [W1] instruction, illustrating that different addressing modes can be used for the source (register direct) and destination (register indirect) operands. The mov instruction’s execution copies the content of W0 to the destination effective address, which is the content of W1 (0x1002 as seen in the figure), thus modifying location 0x1002 to contain 0x1000.

There are other types of indirect addressing modes; these are reserved for Chapter 6 as indirect addressing is better appreciated when discussed in the context of pointer operations in the C language.

Figure 3.16
Example operation: mov.{b} [Wns], [Wnd]

Instruction Set Regularity

At this point you might be somewhat dizzy from the number of different mov types that have been covered and also fearful of the remaining 70 instructions yet to be discussed. In an attempt to put your mind at ease, we assert that you will soon discover that there is a regularity to the PIC24 instruction set architecture. This means that once the rules are understood for one instruction class, the same rules apply to other instructions in the same class. The number of instruction classes is small, so there are not very many different rule sets that must be understood. Rest assured that the task is not as difficult as it may seem at this particular point!

Sample Question: Assume that registers and memory contain the values of Table 3.3 at the start of each of the following instructions. Give the register or memory location that is modified by the instruction and its new value.

Table 3.3: Register Memory Contents

1. mov W2, W0

2. mov 0x1000, W0

3. mov #0x1000, W0

4. mov W2, [W0]

5. mov.b W2, W0

6. mov W2, [W1]

7. mov 0x1000, WREG

8. mov 0x1000

Answers:

1. This performs (W2) → W0, or 0x8345 → W0, so W0 is modified to 0x8345.

2. This performs (0x1000) → W0, or 0x382A → W0, so W0 is modified to 0x382A.

3. This performs 0x1000 → W0, so W0 is modified to 0x1000.

4. This performs (W2) → (W0), or 0x8345 → 0x1000, so location 0x1000 is modified to 0x8345.

5. This performs (W2.LSB) → W0.LSB, so W0 is modified to 0x1045.

6. This performs (W2) → (W1), or 0x8345 → 0x1003, which is an illegal operation because a word operation is being done to an odd address. Location 0x1003 is not modified, and a misaligned address exception is generated (Chapter 9).

7. This performs (0x1000) → WREG, or 0x382A → W0, so W0 is modified to 0x382A.

8. This performs (0x1000) → 0x1000, or 0x382A → 0x1000, location 0x1000 is left unchanged as 0x382A.

Basic Arithmetic and Control Instructions

Two other classes of microprocessor instructions are arithmetic and control classes. All microprocessors have arithmetic instructions such as addition, subtraction, increment (+1), and decrement (–1) instructions. Some microprocessors, such as the PIC24, also have instructions for multiplication and division; these can be implemented using addition, subtraction, and shifts if dedicated multiplication and division instructions are not present.

Three-Operand Addition/Subtraction

The PIC24 supports both a three-operand form and a two-operand form for addition and subtraction operations. The first three-operand forms to be discussed are:

add{.b} Wb, Ws,  Wd     ; (Wb) + (Ws) → Wd
sub{.b} Wb, Ws,  Wd     ; (Wb) – (Ws) → Wd
subr{.b} Wb, Ws, Wd     ; (Ws) – (Wb) → Wd

Recall that the Ws, Wd symbols allow both register direct and register indirect addressing modes (with the exception of register offset indirect). The Wb operand is limited to register direct addressing. Examples of different addressing mode forms for these instructions are:

add W2, [W8], W3           ; reg. indirect for Ws, reg. direct for Wd
sub.b W3, W4, W5           ; reg. direct for Ws, Wd
subr W9, [W4++], [  W8]    ; these indirect modes are covered in Chapter 6

The advantage of a three-operand addition or subtraction is that the values of the two source operands can be preserved if desired. Figure 3.17 shows three addition examples using register direct addressing. Figure 3.17(a) illustrates execution of the add W0, W1, W2 instruction, while Figure 3.17(b) shows the same operation in byte mode. Figure 3.17(c) demonstrates that the three operands can be any of the working registers, including having all three operands from the same register.

Three subtraction examples are given in Figure 3.18. Because subtraction is not commutative, the instructions sub W0, W1, W2 and sub W1, W0, W2 are not mathematically equivalent. When using a subtraction instruction, you must remember that sub Wb, Ws, Wd implements (Wb) - (Ws) and not vice versa, as shown in Figures 3.18(a) and 3.18(b). Byte mode operation for subtraction is shown in Figure 3.18(c). The subr instruction forms perform the reverse subtraction operation, so subr Wb, Ws, Wd implements (Ws) - (Wb).

The three-operand forms for addition/subtraction using immediate addressing are:

add{.b} Wb, #lit5,  Wd     ; (Wb) + #lit5 → Wd
sub{.b} Wb, #lit5,  Wd     ; (Wb) – #lit5 → Wd
subr{.b} Wb, #lit5, Wd     ; #lit5 – (Wb) → Wd

Figure 3.17
Instruction execution: add{.b} Wb, Ws, Wd

Figure 3.18
Instruction execution: sub{.b} Wb, Ws, Wd

The #lit5 operand is a 5-bit unsigned literal, meaning the number range that can be encoded is 0 to 31. When this 5-bit unsigned literal is used in an operation with an 8-bit or 16-bit operand, its upper bits are filled with zeros to match the length of the other operand. This is known as zero extension, and this is discussed in more detail in Chapter 4. The previous addition and subtraction instructions are limited to a 5-bit literal because of the number of available bits in the 24-bit instruction word after the base register Wb and destination register Wd locations are encoded. Indirect addressing modes can be used with the Wd operand, while the Wb operand is limited to register direct addressing. These instructions form a convenient mechanism for adding or subtracting a small constant using a single instruction. A larger constant can be added or subtracted by the two-instruction sequence:

mov #lit16, Wx        ; #lit16 → Wx
add/sub  Wb, Wx, Wd  ; Wb  +/- Wx → Wd

Two-Operand Addition/Subtraction

The two-operand addition/subtraction forms use file register addressing and the default working registers as follows:

add{.b}  f           ; (f) + (WREG) → f
add{.b}  f, WREG     ; (f) + (WREG) → WREG
sub{.b}  f           ; (f) – (WREG) → f
sub{.b} f, WREG      ; (f) – (WREG) → WREG
subr{.b} f           ;(WREG) – (f) → f
subr{.b} f, WREG     ;(WREG) – (f) → WREG

As with the mov form that used these addressing modes, the data RAM address f must be in near RAM (the first 8192 locations). The advantage of the two-operand addition/subtraction forms is that a value can be added to a data memory location without first copying that location’s value into a working register. The disadvantages of the two-operand addition/subtraction forms are that one of the source operands must be overwritten (destroyed) and the restriction to near memory for the data memory address.

Two-operand addition/subtraction using immediate addressing and register direct addressing are:

add{.b} #lit10, Wn     ; (Wn) + #lit10 → Wn
sub{.b} #lit10, Wn     ; (Wn) – #lit10 → Wn

The #lit10 operand is a 10-bit unsigned literal whose range is 0 to 1023 for word operations and 0 to 255 for byte operations. These instructions forms are limited to register direct addressing for the Wn operand; there is not a subr form available.

Increment, Decrement Instructions

Various increment and decrement instructions are provided; the first forms to be discussed are:

inc{.b} Ws,  Wd     ;(Ws) + 1 → Wd
inc2{.b} Ws, Wd     ;(Ws) + 2 → Wd
dec{.b} Ws,  Wd     ;(Ws) – 1 →  Wd
dec2{.b} Ws, Wd     ;(Ws) – 2 →  Wd

These increment/decrement instructions use register direct and register indirect addressing modes for the source and destination operands. The inc2/dec2 instructions provide increment/decrement by 2 instead of by 1, which is useful when you’re incrementing a register containing a data memory address and want the address to be incremented to the next 16-bit location. You might wonder why these instructions are included when it appears that the same functionality can be accomplished by use of the add/sub instructions using the short literal (#lit5). The answer is that the indirect modes available for the source operand (Ws) in the increment/decrement instructions provide more flexibility than add/sub instructions with a short literal. If the source operand uses register direct addressing, then there is no advantage either way in using the increment/decrement instructions versus using the add/sub instruction with a short literal.

Some additional forms of increment/decrement that use file register direct and the working register are:

inc{.b} f              ;(f) + 1 → f
inc{.b} f, WREG        ;(f) + 1 → WREG
inc2{.b} f             ;(f) + 2 → f
inc2{.b} f, WREG       ;(f) + 2 → WREG
dec{.b} f              ;(f) – 1 → f
dec{.b} f, WREG        ;(f) – 1 → WREG
dec2{.b} f             ;(f) – 2 → f
dec2{.b} f, WREG       ;(f) – 2 → WREG

As previously seen with the mov instruction using this addressing mode, f is limited to near RAM, the lowest 8192 locations in data memory.

Program Control: goto

Control instructions affect the program counter content, with the simplest form being a goto, an unconditional transfer of control. Figure 3.19 shows the machine code format for the goto instruction.

The nnn ... nn bits encoded in the goto machine code shown in Figure 3.19 specify the program memory location that control is transferred to, which is known as the target address. The nnn ... nn bits form a 22-bit value that specifies one of the 4 Mi (2²²) possible instruction word locations; this 22-bit value is concatenated with a 0 bit to form a 23-bit value that is loaded into the 23-bit program counter.

Figure 3.19
Instruction format: goto Expr

The goto instruction requires two instruction words (6 bytes), with the second instruction word containing bits 22 through 16 of the target address and the first instruction word containing the remaining bits of the target address. There is only one other instruction in the PIC24 instruction set that requires two instruction words to encode (a call instruction) and it also specifies a target address in the program address space. The call instruction is also in the control class; and its usage is covered in Chapter 6.

Sample Question: Assume that registers and memory contain the values of Table 3.4 at the start of each of the following instructions. Give the register or memory location that is modified by the instruction and its new value.

Table 3.4: Register Memory Contents

1. add W2, W1, W0

2. add 0x1000, WREG

3. sub.b W0, #5, W1

4. inc [W0], W1

5. dec.b W0, [W1]

6. sub W2, [W0], [W0]

7. dec2.b W2, W1

8. inc 0x1004

Answers:

1. This performs (W2) + (W1) → w0, or 0x8345 + 0x1003 → w0, so w0 is modified to 0x9348.

2. This performs (0x1000) + w0 → w0, or 0x382A + 0x1000 → w0, so w0 is modified to 0x482A.

3. This performs (w0.LSB) – 5 → W1.LSB, or 0x00 – 5 → W1.LSB, so W1 is modified to 0x10FB.

4. This performs ((w0)) + 1 → W1, or (0x1000) + 1 → W1, or 0x382A + 1 → W1, so location W1 is modified to 0x382B.

5. This performs (w0.LSB) – 1 → (W1), or 0x00 – 1 → 0x1003, so 0x1003 is modified to 0xFF. The word at location 0x1002 now reads as 0xFF80.

6. This performs (W2) – ((w0)) → (w0), or 0x8345 – (0x1000) → 0x1000, or 0x8345 – 0x382A → 0x1000, so location 0x1000 is modified to 0x4B1B.

7. This performs (W2.LSB) – 2 → W1.LSB, or 0x45 – 2 → W1.LSB, so W1 is modified to 0x1043.

8. This performs (0x1004) + 1 → 0x1004, or 0xFFFF + 1 → 0x1004, location 0x1004 is modified to 0x0000 (0xFFFF + 1 wraps back to 0x0000).

A PIC24 Assembly Language Program

At this point, you have more than enough instructions to write a simple PIC24 assembly language program. In this book, programs are first written in C then translated (compiled) to assembly language. This is done for the instructional purpose of illustrating the linkage between high-level programming language statements and their implementation in assembly language. Furthermore, programs written in a high-level language such as C improves the clarity of the program’s functionality, as assembly language can be obtuse, especially for readers new to assembly language programming. This also prepares you for the hardware interfacing chapters of this book, which uses the C language for all its example programs. If you are new to the C language, do not worry, as C language statements are introduced gradually and are fully explained. A previous exposure to any modern programming language is all that is necessary to understand the C program examples used in this book. Example C programs only use those C language statements necessary to demonstrate PIC24 μC capabilities and do not attempt to cover the entire C language. A program called a compiler is generally used to perform the translation from C to assembly language. This book uses the MPLAB XC16 compiler for this task when C programs illustrating microcontroller hardware interfacing are covered beginning in Chapter 8.

Table 3.5 shows the standard C data types of char, short, int, long, and long long in their unsigned and signed varieties and how these types are referred to in this book. The standard C types represent different sized variables, with a char variable always requiring 8 bits (1 byte). A problem with the standard C data types is that the sizes of the int and long types are compiler-dependent. For example, in the C compiler used in this book for the PIC24 μC, the int and long types are 16 bits and 32 bits. However, these same types may be 32 bits and 64 bits for a different processor/compiler combination, which can create issues when porting C code. To address this problem, the C99 C standard (ISO/IEC 9899:1999) defined new integer data types that explicitly state the number of bits for each type. It has become common practice for C programmers, especially when writing code for microcontrollers, to use these new types because variable size affects both data and program storage requirements, along with program execution time. So, referring to Table 3.5 for an example, an unsigned int variable is declared as a uint16_t variable in any C program used in this book.

Table 3.5: C Data Types

C-to-PIC24 Assembly Language

A C program that uses the data transfer and arithmetic operations discussed so far is shown in Listing 3.1. Line numbers have been added for clarity, but would not be part of the actual C program source code.

Listing 3.1: A “Simple” C Program

(1)       uint8_t u8_i, u8_j, u8_k;

(2)       void main(void) {
(3)         u8_i = 100;           // u8_i = 100 (0x64)
(4)         u8_i = u8_i + 1;      // u8_i++, u8_i = 101 (0x65)
(5)         u8_j = u8_i;          // u8_j is 101 (0x65)
(6)         u8_j = u8_j - 1;      // u8_j , u8_j is 100 (0x64)
(7)         u8_k = u8_j + u8_i;   // u8_k = 201 (0xC9)
(8)       }

The C language is case sensitive, with all reserved keywords, such as void, being lowercase. Comments begin with two // characters and can start anywhere on a line (this is actually a C++ language comment now included in the C99 standard, so it is accepted by modern C compilers). Comments can also be delimited by a starting /* and an ending */ with the delimiters spanning multiple lines. Simple C statements are terminated by a semicolon (;). Compound statements, which are composed of multiple simple C statements, are bracketed by {}. Line 1 defines three variables of type uint8_t, which from Table 3.5 means that each variable is 8 bits, or 1 byte, and represents unsigned data (unsigned char). The unsigned modifier tag combined with the 8-bit data size gives a value range of 0 to 255 for each variable. Chapter 5 discusses the difference between unsigned and signed data types and the effect this has on arithmetic operations. C code examples in this text use prefixes (i.e. u8_) as part of variable names as a strong hint of their data type in order to improve code readability. Line 2 defines the entry point for the C program, which must be named main. The (void) before the main label indicates that this main function returns no value, which will always be the case in this book. The (void) after the main label indicates that main receives no parameters, which will always be the case for C programs in this book (Chapter 6 discusses the details of parameter passing and value return for C functions in more detail). The body of the main code is a compound C statement, enclosed by {}. Line 3 assigns the constant value 100 to the variable u8_i. Line 4 increments u8_i by 1; u8_i contains the value 101 after execution of this statement. The C statement u8_i++, where ++ is the C increment operator, could be used instead of u8_i= u8_i + 1. Line 5 copies the value of u8_i to u8_j. Line 6 decrements u8_j, so u8_j contains the value 100 after execution of this statement. The statement u8_j-- could be used instead of u8_j= u8_j - 1. Line 7 adds u8_j to u8_i and places the result in u8_k.

The first step in translating (i.e., compiling) the program in Listing 3.1 to PIC24 assembly language is to choose locations for the variables u8_i, u8_j, and u8_k. This can be any data RAM location that is not assigned to a special function register. You could also simply use working registers to represent variables, and you’ll see how to do this when C functions are covered in Chapter 6. This example uses data RAM for variables. For simplicity, the example uses the first available data RAM locations, which are 0x1000 for u8_i, 0x1001 for u8_j, and 0x1002 for u8_k (recall that data RAM locations 0x0000 through 0x0FFF are reserved for special function registers). Figure 3.20 shows the program of Listing 3.1 translated to PIC24 assembly language.

Figure 3.20
The contents of main in the “simple” C program compiled to PIC24 assembly language

The compilation is straightforward when only one line is considered at a time. Optimizing C compilers (and expert assembly language programmers) consider multiple C language statements at a time during compilation in an effort to reduce the total number of instructions, and it may be difficult to correlate the final assembly code with the original C language statements. This book does not expect you to become an expert assembly language programmer; this only occurs after a considerable amount of time is spent crafting assembly language programs. Instead, this book strives for clarity and understanding, so it will always perform C-to-PIC24 assembly language translation in the most straightforward manner possible. Some comments on the assembly code in Figure 3.20 are:

Byte operations (.b) are used because u8_i, u8_j, and u8_k are uint8_t (1 byte) variables.

You might wonder why both W0 and WREG are used in the assembly language for the first C statement u8_i = 100. Only WREG can be used in a mov instruction that is a byte operation and specifies a file register address, so the destination for the constant value 100 must be specified as WREG. However, the mov instruction with an immediate operand requires one of the working registers, W0 through W15, as its destination. It is a syntax error to write mov.b #100, WREG and also a syntax error to write mov.b W0,i even though W0 and WREG are physically the same register.

You might wonder why the assembly statement mov.b 0x1000, 0x1001 is not used for the u8_j = u8_i operation. The answer is simple—there is no PIC24 mov instruction that allows file register addresses to be specified for both the source and destination operands. Instead, u8_i must be copied to a working register first, and then copied from the working register to u8_j.

Of the C language statements in Figure 3.20, the statement u8_k = u8_j + u8_i is the most difficult, and this implementation uses five PIC24 instructions. The first step copies the operands u8_i and u8_j to registers W1 (u8_i) and W0 (u8_j). Copying u8_i to W1 required two instructions (mov.b 0x1000, WREG followed mov.b W0, W1) because a byte operand can only be copied into WREG (W0) from data memory. The code then uses a three-operand add instruction for the addition, placing the result into W0. Using W0 for the result was done because the result is copied to operand u8_k with the next instruction, and only WREG (W0) can be used for a byte copy to data memory. Observe that the instruction add.b W0, W1, W0 can be replaced by add.b W1, W0, W0 since addition is commutative (u8_j + u8_i is the same as u8_i + u8_j). You could have also used a different register such as W2 to hold u8_i; the choice is arbitrary.

Listing 3.2 shows an alternate implementation for u8_k = u8_i + u8_i that requires only three instructions as it uses the two-operand form of the add instruction. Typically, use of two-operand ALU instructions requires fewer instructions for implementing small C code fragments. Unfortunately, we have found that new assembly language programmers make mistakes more often using two-operand ALU instruction forms as they forget which register is the source and which is the destination as well as forgetting that one of the operands is overwritten by the result. As such, the assembly language examples will often sacrifice instruction efficiency by using three-operand ALU instruction forms for extra clarity. In many cases, we will show different assembly language implementations for the same C code fragment to illustrate that there are multiple approaches. We stress that new assembly language programmers should use the approach that they best understand and strive for code efficiency once coding experience has been gained.

Listing 3.2: Alternate Implementation of u8_k = u8_j + u8_i

mov.b 0x1000, WREG      ;WREG = u8_i
add.b 0x1001, WREG      ;WREG = u8_j + WREG = u8_j + u8_i
mov.b WREG, 0x1002      ;u8_k = WREG = u8_j + u8_i

Listing 3.3 shows an implementation for the C language statement u8_k = u8_j - u8_i with the only difference from the code in Figure 3.20 being the use of a three-operand sub instruction instead of the add instruction. Observe that the instruction sub.b W0, W1, W0 cannot be replaced by sub.b W1, W0, W0 since subtraction is not commutative (u8_j - u8_i is not the same as u8_i - u8_j).

Listing 3.3: Implementation of u8_k = u8_j – u8_i

mov.b 0x1000, WREG        ; WREG = u8_i
mov.b W0, W1              ; W1 = u8_i
mov.b 0x1001, WREG        ; WREG = u8_j
sub.b W0, W1, W0          ; W0 = W0 -  W1 = u8_j -  u8_i
mov.b WREG, 0x1002        ; u8_k = WREG = u8_j -  u8_i

The code in Figure 3.20 makes heavy use of WREG because of the need for byte operations to data memory and the use of file register addressing. There are many other possible assembly language implementations using addressing modes such as indirect addressing, but we reserve that discussion for a later chapter.

The PIC24 assembly language of Figure 3.20 is somewhat obtuse because memory addresses (0x1000, 0x1001, and 0x1002) are used instead of the variable names u8_i, u8_j, u8_k. Also, there is still the problem of translating the PIC24 instruction mnemonics to machine code, a process that is interesting the first time, boring the second time, and painful thereafter. A program called an assembler automatically converts instruction mnemonics to machine code. Microchip Technology Inc. provides the MPLAB X Integrated Design Environment (IDE), which contains an assembler and simulator for most Microchip microprocessors. Listing 3.4 gives the assembly language of Figure 3.20 written in a more readable form, and in a format compatible with the MPLAB X PIC24 assembler (the line numbers are not part of the source file).

Listing 3.4: MPLAB X Compatible Assembly Source Code for “Simple” C Example

(1)  .bss                        ;uninitialized data section
(2)  ; Reserve one byte (8 bits) for each variable using .space directive
(3)  u8_i:    .space 1  ; Allocating space (in bytes) to variable.
(4)  u8_j:    .space 1           ; Allocating space (in bytes) to variable.
(5)  u8_k:    .space 1           ; Allocating space (in bytes) to variable.
(6)  ; Code Section in Program Memory
(7)  .text                       ; Start of Code section
(8)  .global __reset:            ; first instruction located at __reset label
(9)  __reset:
(10)  ; u8_i = 100;
(11)  mov.b #100, W0             ; W0 = 100
(12)  mov.b WREG,u8_i            ; u8_i = 100
(13)
(14)  ; u8_i = u8_i + 1;
(15)  inc.b u8_i                 ; u8_i = u8_i + 1
(16)
(17)  ; u8_j = u8_i
(18)  mov.b u8_i, WREG           ; W0 = u8_i

(19)  mov.b WREG, u8_j           ; u8_j = W0
(20)
(21)  ; u8_j = u8_j     1;
(22)  dec.b u8_j                 ; u8_j  = u8_j     1
(23)
(24)  ;u8_k = u8_j + u8_i
(25)  mov.b u8_i, WREG           ; W0 = u8_i
(26)  mov.b W0, W1               ; W1 = u8_i
(27)  mov.b u8_j, WREG           ; W0 = u8_j
(28)  add.b W0, W1, W0           ; W0 = W0+W1 = u8_j + u8_i
(29)  mov.b WREG, u8_k           ; u8_k = W0
(30)
(31)  done:
(32)  goto done                  ; Placeholder for last line of executed code

The .bss statement in line 1 is called an assembler directive, which is an instruction to the assembler and not a PIC24 assembly language statement. Lines 1, 3-5, 7, and 8 are all assembler directives. Note that assembler directives are case-insensitive and have a period in front of the name. The .bss assembler directive indicates the start of a section that contains uninitialized data to be placed in data RAM. The three .space assembler directives that follow allocate one byte of space for each of the u8_i, u8_j, and u8_k variables. Observe that each .space directive begins with a label that indicates a variable name. A label is case-sensitive, ends with a colon, and is assigned either a data memory address or a program memory address by the assembler. The .space directives use the first available data memory location, and thus labels u8_i, u8_j, u8_k are assigned the data memory addresses 0x1000, 0x1001, and 0x1002, respectively. Using .space directives and labeling in this manner allow the u8_i, u8_j, u8_k variables to be referenced by name within PIC24 instructions instead of using absolute memory locations, which greatly improves code clarity. The .space directive can also be used to insert bytes between declarations of different sized variables (8-bit, 16-bit, and 32-bit) to ensure that 16-bit and 32-bit variables start on a word boundary. Line 11 contains an instruction mnemonic, which is case-insensitive. Labels, assembler directives, and instruction mnemonics may start in any column that is desired. A semicolon is used to start a comment.

The .text assembler directive at line 7 indicates the start of a section that contains instructions to be placed in program memory. There must be separate assembler directives, .bss and .text, for data memory and program memory because these are two different physical memories in the PIC24 architecture. A special label named __reset (line 9) is required by the PIC24 assembler for the first executable instruction in the program, which happens to be line 11 in this case. The .global __reset assembler directive at line 8 declares the label __reset as a global label, that is, a label that is accessible outside of the scope of this file.

Locations 0x0 through 0x01FF of program memory are reserved for reset handling, and trap and interrupt vectors (discussed in Chapter 9), so for this particular PIC24 CPU the first location available for instructions is 0x0200. At power up, the program counter is reset to 0x0. Thus, the first PIC24 instruction is fetched from program memory location 0x0, which is called the reset handler. The assembler automatically generates a goto __reset instruction that occupies the two words at program memory locations 0x0 and 0x2, which is the reason why the first executable instruction in the program must be labeled as __reset. In other words, the contents of locations 0x0 and 0x2 hold a goto statement to the start of the user program. The first instruction of the program may not be located at address 0x0200 as the assembler may place code at that location for initializing data memory and for other system functions based on compiler/assembler/linker options set in MPLAB X. However, it is certain that the first instruction of the program will be located in program memory somewhere after address 0x0200, and the goto __reset instruction at location 0x0 will cause a jump to the program on processor reset. Labels in program memory are used as symbolic names for instruction addresses, generally as the target of a change of control instruction, such as goto __reset. A label does not have to be on the same line as the instruction that it labels. The __reset label is on line 9 but it refers to the next instruction in the program, which is at line 11 (a comment is on line 10).

The instructions in lines 11–29 duplicate the instructions of Figure 3.20, except that the names u8_i, u8_j, and u8_k are used instead of absolute addresses. Lines 31 and 32 contain the infinite loop goto done, where the target address done is the location of the goto instruction. A microcontroller program never really ends; it must always be doing something, as there is no place for the program to go when it finishes! When a program exits on a personal computer, control is returned to the operating system, which is in an infinite loop waiting for input from the keyboard, mouse, or some other input device. A microcontroller program is also typically an infinite loop that is waiting on input from some external device such as a car engine, sensor array, and so forth. In this simple example, the program execution is trapped when it falls into the goto done infinite loop. Another method to halt program execution is to stop the processor clock; this is discussed in Chapter 8.

Listing 3.5 gives the machine code listing produced by the MPLAB X assembler for the assembly language program of Listing 3.4. The address column gives the program memory location in hex and the opcode column the machine code for the mnemonic to the right. In this case, the program was placed in program memory by MPLAB X beginning at location 0x0029E (this starting location may change depending on MPLAB X version and assembler/linker options). Not counting the reset vector at location 0x0, this program takes 12 PIC24 instructions to implement located in addresses 0x29E through 0x2B6. The number of required instruction words is 13 (12+1), because the goto done instruction at location 0x002B4 requires two instruction words.

Listing 3.5: Machine Code Listing for Simple C Program

address opcode      label     mnemonic
00000   04029E       GOTO _ _reset
00002   000000       NOP
0029E   B3C640      _ _reset   MOV #0x64, W0
002A0   B7F000       MOV.B WREG, u8_i
002A2   EC7000       INC.B u8_i
002A4   BFD000       MOV.B u8_i, WREG
002A6   B7F001       MOV.B WREG, u8_j
002A8   ED7001       DEC.B u8_j
002AA   BFD000       MOV.B u8_i, WREG
002AC   784080       MOV.B W0, W1
002AE   BFD001       MOV.B u8_j, WREG
002B0   404001       ADD.B W0, W1, W0
002B2   B7F002       MOV.B WREG, u8_k
002B4   0402B4       done    GOTO done
002B6   000000       NOP

The nop Instruction

The second word of the goto instructions in Listing 3.5 displays as a nop instruction, which stands for “NO oPeration”. A nop simply causes the instruction word to be fetched and the PC to be incremented to the next instruction word. One machine code encoding of a nop has the upper 8 bits as 0 and the remaining bits as don’t-cares. All instructions that require two words have the second instruction word encoded such that the upper 8 bits are 0, as seen at locations 0x0002 and 0x021A in Listing 3.5. A second nop encoding has the upper 8 bits as 1, and the remaining bits as don’t-cares. This second nop encoding was chosen because program memory contains all 1s when the program memory is in a blank or erased state. This means that the values of 0xFFFFFF and 0x000000 are both treated as nop instructions. These nop encodings were selected because any erased location contains 0xFFFFFF, and any memory location in the 4 Mi instruction word range that is not physically implemented returns a 0x000000 when read. In this way, if a program error causes a jump to a portion of program memory that is erased, continuous nop instructions (0xFFFFFF) are fetched until the program counter exceeds physical memory. Then, 0x000000 values (nop instructions again) are read until the PC wraps back to the reset location of 0x0, simulating a device reset. An internal register of the PIC24 μC can be checked by the startup code to determine if a physical reset actually occurred; and if not, then an error indicator can be displayed indicating that an anomalous reset condition occurred.

16-Bit (Word) Operations

Assume the C program of Listing 3.1 is changed to use uint16_t (16-bit) variables instead of uint8_t (8-bit) variables. This means that all operations using these variables now require 16-bit operations instead of 8-bit operations. Listing 3.6 gives the new assembly language implementation assuming uint16_t (16-bit) variables.

Listing 3.6: Simple C Program with 16-Bit Operations

(1)   .bss                     ; Uninitialized data section
(2)   ; Reserve two bytes (16 bits) for each variable using .space directive
(3)   u16_i:  .space 2         ; Allocating space (in bytes) to variable.
(4)   u16_j:  .space 2         ; Allocating space (in bytes) to variable.
(5)   u16_k:  .space 2         ; Allocating space (in bytes) to variable.
(6)   ; Code Section in Program Memory
(7)   .text                    ; Start of Code section
(8)   .global _ _reset:        ; First instruction located at _ _reset label
(9)   _ _reset:
(10)  ; u16_i = 2047;
(11)  mov #2047, W0            ; W0 = 2047
(12)  mov W0, u16_i            ; u16_i = 2047
(13)
(14)  ; u16_i = u16_i + 1;
(15)  inc u16_i                ; u16_i = u16_i + 1
(16)
(17)  ; u16_j = u16_i
(18)  mov u16_i, W0            ; W0 = u16_i
(19)  mov W0, u16_j            ; u16_j = W0
(20)
(21)  ; u16_j = u16_j   1;
(22)  dec u16_j                ; u16_j  = u16_j     1
(23)
(24)  ; u16_k = u16_j + u16_i
(25)  mov u16_j, W1            ; W1 = u16_j
(26)  mov u16_i, W2            ; W2 = u16_i
(27)  add W1, W2, W0           ; W0 = W1 + W2 = u16_j + u16_i
(28)  mov W0, u16_k            ; u16_k = W0
(29)
(30)  done:
(31)  goto done                ; Placeholder for last line of executed code

The changes in Listing 3.6 to illustrate 16-bit operations are:

The prefix for these variable names is changed from u8 to u16 resulting in variable names of u16_i, u16_j, and u16_k (this is not a required change, but is the variable naming procedure followed by this textbook).

The .space directives in lines 3–5 now reserve two bytes (16 bits) instead of 1 byte for each variable. This means that the addresses of the u16_i, u16_j, u16_k variables are now 0x1000, 0x1002, and 0x1004.

All of the byte (.b) operations in Listing 3.4 are removed so that 16-bit operations are performed.

All WREG references in mov instructions are replaced by W0 since WREG usage is only required for byte operations.

The constant value copied to the u16_i variable has been changed from 100 to 2047 to illustrate that a larger number range is available with 16-bit variables.

The W1 and W2 registers are used to hold the u16_j and u16_i variables, respectively, for the u16_k = u16_j + u16_i computation to illustrate that register choice is flexible. Observe that it now only takes four assembly instructions (lines 25-28) to implement this C statement instead of the five instructions used in Listing 3.4 because u16_j and u16_i can be copied directly to their respective registers instead of having to be copied first to WREG.

The program of Listing 3.6 requires 11 PIC24 instructions, which is one fewer than Listing 3.4, and thus executes slightly faster. This is the advantage of a 16-bit CPU architecture in that 16-bit operations are at least as efficient as 8-bit operations in terms of program memory size and execution time, with only data memory requirements increasing due to the larger variable size. However, if the C data types of Listing 3.1 are changed to uint32_t (32-bit) variables, then the assembly code of Listing 3.6 would change significantly, requiring approximately double the number of instructions and execution time. Operations on 32-bit data are examined in detail in Chapter 5.

Sample Question: Write a PIC24 assembly language fragment that implements the C statement “u16_k = u16_i + u16_j + 178;” where u16_k, u16_i, u16_j are all uint16_t variables.

Answer: One solution is:

(1)  mov u16_i, W1      ; W1 = u16_i
(2)  mov u16_j, W2      ; W2 = u16_j
(3)  mov #178, W3       ; W3 = 178
(4)  add W1, W2, W0     ; W0 = W1 + W2 = u16_i + u16_j
(5)  add W0, W3, W0     ; W0 = W0 + W3= (u16_i + u16_j) + 178
(6)  mov W0, u16_k      ; u16_k = W0 = u16_i + u16_j + 178

Observe that a single C statement may require several PIC24 assembly language statements, as several operations can be written in one C statement. Translating the C statement to PIC24 statements requires that you decompose the C statement into steps that the PIC24 can accomplish. In this case, the solution first copied the operands u16_i, u16_j, and the constant 178 into registers (the input step). It then performed the sum using two addition instructions (the process step). The last step copied the result back into variable u16_k (the output step). The input/process/output steps will be the standard approach to implementing C language statements.

Another solution that uses the two-operand add instruction form is:

(1)   mov u16_j, W0     ; W0 = u16_j
(2)   add #178, W0      ; W0 = W0 + 178 = u16_j + 178
(3)   add u16_i, WREG   ; W0 = u16_i + W0 = u16_i + (u16_j + 178)
(4)   mov W0, u16_k     ; u16_k = W0 = u16_i + u16_j + 178

This implementation requires two fewer instructions, but the input/process/output steps are not as clear. It can also be confusing that the add instruction in line 2 requires W0 usage, while the add instruction of line 3 requires WREG usage. Another drawback of this approach is that the two-operand add of line 3 requires the u16_i operand to be in near RAM. Because of these issues, the assembly language examples will make limited use of two-operand ALU instructions.

Sample Question: A neophyte assembly language programmer translated the C code fragment:

uint16_t u16_k, u16_j;
u16_k = u16_j + 1;

to the assembly language sequence of:

(1)  inc u16_j
(2)  mov u16_j, W1
(3)  mov W1, u16_k

What is wrong with this?

Answer: The inc u16_j instruction modifies the variable u16_j. The C statement u16_k = u16_j + 1 only modifies u16_k; the variable u16_j is not modified. One correct solution following the input/process/output approach is:

(1)  mov u16_j, W0      ; W0 = u16_j
(2)  mov #1, W1         ; W1 = 1
(3)  add W0, W1, W0     ; W0 = W0 + W1 = u16_j + 1
(4)  mov W0, u16_k      ; u16_k = W0 = u16_j + 1

Observe that variable u16_j is unmodified by this implementation. A more efficient implementation is:

(1)  mov u16_j, W0      ; W0 = u16_j
(2)  add W0, #1, W0     ; W0 = W0 + 1 = u16_j + 1
(3)  mov W0, u16_k      ; u16_k = W0 = u16_j + 1

This saves one instruction by using the three-operand add form that allows an immediate operand. However, you must be careful when using this add form as the immediate value is limited to a 5-bit range (0 to 31). If the C statement was u16_k = u16_j + 50 then the first approach is valid as the statement mov #50, W1 is legal since any 16-bit constant can be used for the immediate operand. However, the second approach would be invalid, as the statement add W0, #50, W0 is illegal because the constant 50 does not fit in 5-bits (is greater than 31). In order to avoid the size restrictions of some specialized instruction forms, the safest approach for a constant is to copy it into a register before use.

An implementation for the C statement u16_k = u16_j + 1 that uses only two instructions is:

(1)  inc u16_j, WREG         ; WREG = u16_j + 1
(2)  mov W0, u16_k           ; u16_k = W0 = u16_j + 1

As previously stated, it is strongly suggested that new assembly language programmers use the approach that is best understood and should not be overly concerned about implementation efficiency.

The Clock and Instruction Execution

The clock signal that controls instruction execution in the PIC24E μC supports a maximum frequency of 140 MHz for the devices available during the writing of this book. Methods for generating this clock signal and setting its frequency are discussed in Chapter 8. The symbol FOSC refers to the clock signal’s frequency and TOSC refers to its period. A finite-state machine within the PIC24E μC controls the fetching, decoding, and execution of instructions. Most instructions require one instruction cycle to execute, with one instruction cycle equal to two clock cycles (two TOSC periods). Instruction execution frequency is referred to as FCY and is equal to FOSC/2. A single instruction cycle requires a time of 1/FCY. For numerical simplicity, assume a 100 MHz clock, which has a period of 10 ns (see Chapter 1). One instruction cycle takes two clocks, so this is a time of 2 * 10 ns = 20 ns. This means that a PIC24E μC with a 100 MHz clock can execute instructions at a rate of approximately 50 million instructions per second (MIPS) as 1/20 ns = 50 x 10⁶. (You can also simply divide the clock frequency by two in order to get the MIPS of PIC24E μC.) The actual instruction execution rate will be somewhat lower than this, as any instruction that causes the program counter to change value, such as a goto instruction, requires four instruction cycles (8 FOSC clocks) to execute. There are other conditions, such as register dependencies in effective address calculations, which can require an extra instruction cycle—this is discussed in Chapter 6. The instruction table contained in Appendix A gives the number of instruction cycles required for each instruction.

Sample Question: How many clock cycles and instruction cycles are required to execute from location 0x0029E through 0x0002B2 of Listing 3.5? Assuming FOSC = 100 MHz, how long do these instructions take to execute?

Answer: There are 11 instructions total, and each instruction takes one instruction cycle. So total instruction cycles = 11, and total clock cycles is 11 * 2 = 22.

Total execution time = 1/FOSC * # of clock cycles

= 1/(100 MHz) * 22 = 1/(100 x 10⁶) * 22 = (0.01 x 10^-6) * 22 = (0.22 x 10^-6) = 0.22 μs = 220 ns

Summary

This chapter introduced the basic program and memory architecture of the PIC24 microcontroller and discussed a few PIC24 instructions from the data transfer, arithmetic, and control classes. Two versions of a simple C program were converted to PIC24 assembly language, one version that operated on 8-bit unsigned data and a second version that used 16-bit unsigned data. This prepares you for the next chapter, in which additional arithmetic, shift, and logical operations on unsigned 8-bit and 16-bit data are covered, as well as conditional execution and loop structures.

Review Problems

The answers to the odd-numbered problems can be found in Appendix C.

1. Convert the instruction mov 0x100A, WREG to machine code.

2. Convert the instruction mov W3, W5 to machine code.

3. Convert the machine code 0x780108 to a PIC24 instruction. (Hint: This is a mov instruction.)

4. Convert the machine code 0xB3C228 to a PIC24 instruction. (Hint: This is a mov instruction.).

5. The instruction add W0, #35, W2 is illegal. Why?

6. What value is placed in W1 after the following instruction execution: sub W1, W1, W1. (Hint: It is not necessary to know the value of W1).

7. What register is modified by the following instruction: add W3, W2, W4 ?

8. What register is modified by the following instruction: add u16_j, WREG ?

9. Write a subtract instruction that performs the following 16-bit operation: W6 = W3 - W5.

10. Given a 25 MHz FOSC, how long does it take the instruction goto done to execute in nanoseconds?

11. Given a 20 MHz FOSC, how long does it take the instruction mov W2, W4 to execute in nanoseconds?

12. Write a PIC24 instruction sequence that accomplishes u16_k = u16_i - u16_j - 32, where u16_i, u16_j, and u16_k are uint16_t variables.

13. Write a PIC24 instruction sequence that accomplishes u8_k = u8_i - u8_j -32, where u8_i, u8_j, and u8_k are uint8_t variables.

14. Write a PIC24 instruction sequence that copies data RAM locations 0x1000 through 0x1007 (8 bytes) to locations 0x1800 through 0x1807.

15. How many add W0, W1, W2 instructions are executed in 1 second assuming a 25 MHz FOSC?

For the remaining problems, assume the memory/register contents of Table 3.6 at the start of each instruction and give the modified memory location or register and its content.

Table 3.6: Register Memory Contents for Sample Problems

16. mov W2, 0x1008

17. mov.b 0x1001, WREG

18. mov #0x1001, W1

19. mov.b [W3], [W4]

20. mov [W0], W4

21. mov.b #0x20, W3

22. add W0, [W0], W1

23. dec 0x1004, WREG

24. add 0x1002

25. add.b 0x1000, WREG

26. sub W0, W1, [W0]

27. sub.b W4, #10, [W0]

28. dec2 0x1008

29. dec2.b 0x1008

30. inc W0, [W3]

31. inc2.b 0x1009

32. inc2 0x1008, WREG

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 3 Introduction to the PIC24 Microcontroller Family

Create new playlist

Sign In

Sign Up

CHAPTER 3INTRODUCTION TO THE PIC24 MICROCONTROLLER FAMILY