Chapter 12. Power Amplifiers

Power amplifiers are the most power-hungry building block of RF transceivers and pose difficult design challenges. In the past ten years, the design of PAs has evolved considerably, drawing upon relatively complex transmitter architectures to improve the trade-off between linearity and efficiency. This chapter describes the analysis and design of PAs with particular attention to the limitations that they impose on the transmitter chain. A thorough treatment of PAs would require a book of its own, but our objective here is to lay the foundation. The reader is referred to [1, 2] for further details. The chapter outline is shown below.

12.1 General Considerations

As the first step in our study, we consider a transmitter delivering 1 W (+30 dBm) of power to a 50-Ω antenna. The peak-to-peak voltage swing, V_pp, at the antenna reaches 20 V and the peak current through the load, 200 mA. For a common-source (or common-emitter) stage to drive the load directly, the configurations shown in Figs. 12.1(a) and (b) require a supply voltage greater than V_pp. However, if the load is realized as an inductor [Fig. 12.1(c)], the drain ac voltage exceeds V_DD, even reaching 2V_DD (or higher). While allowing a lower supply voltage, the inductive load does not relax the “stress” on the transistor; the maximum drain-source voltage experienced by M₁ is still at least 20 V (10 V above V_DD = 10 V) if the stage must deliver 1 W to a 50-Ω load.

Figure 12.1 CS stages with (a) resistive, (b) current source, and (c) inductive load.
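The swing figures quoted above follow directly from P = V_p²/(2R) for a sinusoid. A minimal Python sketch of that arithmetic (the function name `load_swing` is only illustrative):

```python
import math

def load_swing(p_out_w, r_load_ohm):
    """Peak voltage, peak-to-peak voltage, and peak current of a sinusoid
    that delivers an average power p_out_w to a resistance r_load_ohm."""
    v_peak = math.sqrt(2.0 * p_out_w * r_load_ohm)  # from P = V_p^2 / (2R)
    i_peak = v_peak / r_load_ohm
    return v_peak, 2.0 * v_peak, i_peak

v_p, v_pp, i_p = load_swing(1.0, 50.0)  # 1 W into a 50-ohm antenna
print(f"V_p = {v_p:.1f} V, V_pp = {v_pp:.1f} V, I_p = {i_p * 1e3:.0f} mA")
```

For 1 W into 50 Ω this gives a 20-V peak-to-peak swing and a 200-mA peak load current, matching the numbers in the text.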

The above example illustrates a fundamental issue in PA design, namely, the trade-off between the output power and the voltage swing experienced by the output transistor. It can be proven that the product of the breakdown voltage and f_T of silicon devices is around 200 GHz · V [3]. Thus, transistors with an f_T of 200 GHz dictate a voltage swing of less than 1 V.

Example 12.1

What is the peak current carried by M₁ in Fig. 12.1(c)? Assume L₁ is large enough to act as an ac open circuit at the frequency of interest, in which case it is called an “RF choke” (RFC).

Solution:

If L₁ is large, it carries a constant current, I_L₁ (why?). If M₁ begins to turn off, this current flows through R_L, creating a positive peak voltage of I_L₁R_L [Fig. 12.2(a)]. Conversely, if M₁ turns on completely, it must “sink” both the inductor current and a negative current of I_L₁ from R_L so as to create a peak voltage of −I_L₁R_L [Fig. 12.2(b)]. The peak current through the output transistor is therefore equal to 400 mA.

Figure 12.2 Output voltage waveform in a CS stage (a) when current flows from inductor to R_L, (b) when current flows from R_L to transistor.

In order to reduce the peak voltage experienced by the output transistor, a “matching network” is interposed between the PA and the load [Fig. 12.3(a)]. This network transforms the load resistance to a lower value, R_T, so that smaller voltage swings still deliver the required power.

Figure 12.3 (a) Impedance transformation by a matching network, (b) realization by a transformer.

Example 12.2

The PA in Fig. 12.3(a) must deliver 1 W to R_L = 50 Ω with a supply voltage of 1 V. Estimate the value of R_T.

Solution:

The peak-to-peak voltage swing, V_pp, at the drain of M₁ is approximately equal to 2 V. Since

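$$P_{out} = \frac{1}{2}\left(\frac{V_{pp}}{2}\right)^2 \frac{1}{R_T} = 1\ \text{W}, \tag{12.1-12.2}$$

we have

$$R_T = 0.5\ \Omega. \tag{12.3}$$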

The matching network must therefore transform R_L down by a factor of 100. Figure 12.3(b) shows an example, where a lossless transformer having a turns ratio of 1:10 converts a 2-V_pp swing at the drain of M₁ to a 20-V_pp swing across R_L.¹ From another perspective, the transformer amplifies the drain voltage swing by a factor of 10.

The need for transforming the voltage swings means that the current generated by the output transistor must be proportionally higher. In the above example, the peak current in the primary of the transformer reaches 10 × 200 mA = 2 A. Transistor M₁ must sink both the inductor current and the peak load current, i.e., 4 A!
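The numbers in this example can be reproduced with a few lines of Python. This is only a sketch: it assumes a lossless transformer, a sinusoidal drain swing whose peak equals the 1-V supply, and the hypothetical helper name `matching_requirements`.

```python
import math

def matching_requirements(p_out_w, r_load_ohm, v_drain_peak):
    """Transformed resistance, impedance ratio, and turns ratio needed so a
    drain swing of peak v_drain_peak delivers p_out_w to r_load_ohm."""
    r_t = v_drain_peak ** 2 / (2.0 * p_out_w)   # P = V_p^2 / (2 R_T)
    impedance_ratio = r_load_ohm / r_t
    turns_ratio = math.sqrt(impedance_ratio)     # ideal transformer
    return r_t, impedance_ratio, turns_ratio

r_t, impedance_ratio, turns_ratio = matching_requirements(1.0, 50.0, 1.0)
i_load_peak = math.sqrt(2.0 * 1.0 / 50.0)        # 200 mA in the 50-ohm load
i_primary_peak = turns_ratio * i_load_peak       # current scales up by the turns ratio
i_transistor_peak = 2.0 * i_primary_peak         # choke current plus peak load current

print(f"R_T = {r_t} ohm, ratio = {impedance_ratio:.0f}, turns = 1:{turns_ratio:.0f}")
print(f"primary peak = {i_primary_peak:.1f} A, transistor peak = {i_transistor_peak:.1f} A")
```

The sketch returns R_T = 0.5 Ω, a 1:10 turns ratio, a 2-A primary current, and a 4-A transistor peak current.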

Example 12.3

Plot V_X and V_out in Fig. 12.1(c) as a function of time if M₁ draws enough current to bring V_X near zero. Assume sinusoidal waveforms. Also, assume L₁ and C₁ are ideal and very large.

Solution:

In the absence of a signal, V_X = V_DD and V_out = 0. Thus, the voltage across C₁ is equal to V_DD. We also observe that, in the steady state, the average value of V_X must be equal to V_DD because L₁ is ideal and therefore must sustain a zero average voltage. That is, if V_X goes from V_DD to near zero, it must also go from V_DD to about 2V_DD so that the average value of V_X is equal to V_DD (Fig. 12.4). The output voltage waveform is simply equal to V_X shifted down by V_DD.

Figure 12.4 Drain and output voltages in an inductively-loaded CS stage.

12.1.1 Effect of High Currents

The enormous currents flowing through the output device and the matching network are one of the difficulties in the design of power amplifiers and the package. If the output transistor is chosen wide enough to carry a large current, then its input capacitance is very large, making the design of the preceding stage difficult. As depicted in Fig. 12.5, we may deal with this issue by interposing a number of tapered stages between the upconversion mixer(s) and the output stage. However, as explained in Chapter 4, the multiple stages tend to limit the TX output compression point. Moreover, the power consumed by the driver stages may not be negligible with respect to that of the output stage.

Figure 12.5 Tapering in a TX chain.

Another issue arising from the high ac currents in PAs relates to the package parasitics. The following example illustrates this point.

Example 12.4

The output transistor in Fig. 12.3(b) carries a current varying between 0 and 4 A at a frequency of 1 GHz. What is the maximum tolerable bond wire inductance in series with the source of the transistor if the voltage drop across this inductance must remain below 100 mV?

Solution:

The drain current of M₁ can be approximated as

$$I_D(t) = I_0 \cos \omega_0 t + I_0, \tag{12.4}$$

where I₀ = 2 A and ω₀ = 2π(1 GHz). The voltage drop across the source inductance, L_S, is given by

$$V_{LS} = L_S \frac{dI_D}{dt} = -L_S \omega_0 I_0 \sin \omega_0 t, \tag{12.5}$$

reaching a peak of L_Sω₀I₀. For this drop to remain below 100 mV, we have

$$L_S < \frac{100\ \text{mV}}{\omega_0 I_0} \approx 7.96\ \text{pH}. \tag{12.6}$$

This is an extremely small inductance. (A single bond wire’s inductance typically exceeds 1 nH.)
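As a quick numerical check of this bound, the peak drop L_Sω₀I₀ can be inverted for L_S. The helper name below is hypothetical; the values are those of the example.

```python
import math

def max_source_inductance(v_drop_max, i0, f0):
    """Largest series inductance (H) whose peak drop L*w0*I0 stays below
    v_drop_max for a current I0*cos(w0*t) + I0."""
    return v_drop_max / (2.0 * math.pi * f0 * i0)

l_s_max = max_source_inductance(v_drop_max=0.1, i0=2.0, f0=1e9)
print(f"L_S < {l_s_max * 1e12:.2f} pH")
```

The result is about 8 pH, more than two orders of magnitude below the inductance of a single bond wire.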

What is the effect of package parasitics? The inductance in series with the source degenerates the transistor, thereby lowering the output power. Moreover, ground and supply inductances may create feedback from the output to the input of the PA chain, causing ripple in the frequency response and even instability.

The large currents can also lead to a high loss in the matching network. The devices comprising this network—especially the inductors—suffer from parasitic resistances, thus converting the signal energy to heat. For this reason, the matching network for high-power applications is typically realized with off-chip low-loss components.

12.1.2 Efficiency

Since PAs are the most power-hungry block in RF transceivers, their efficiency is critical. A 1-W PA with 50% efficiency draws 2 W from the battery—much more than the rest of the transceiver does.

The efficiency of the PAs is defined by two metrics. The “drain efficiency” (for FET implementations) or “collector efficiency” (for bipolar implementations) is defined as

$$\eta = \frac{P_L}{P_{supp}}, \tag{12.7}$$

where P_L denotes the average power delivered to the load and P_supp the average power drawn from the supply voltage. In some cases, the output stage may have a relatively low power gain, e.g., 3 dB, requiring a high input power. A quantity embodying this effect is the “power-added efficiency” (PAE), defined as

$$\mathrm{PAE} = \frac{P_L - P_{in}}{P_{supp}}, \tag{12.8}$$

where P_in is the average input power.
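To see how a low power gain separates the two metrics, consider a short Python sketch. The operating point (1 W out, 2 W from the supply, 3 dB of power gain) is a hypothetical illustration assembled from the numbers mentioned above.

```python
def drain_efficiency(p_load, p_supply):
    """Drain (collector) efficiency: load power over supply power."""
    return p_load / p_supply

def power_added_efficiency(p_load, p_in, p_supply):
    """PAE: the input power is subtracted from the load power."""
    return (p_load - p_in) / p_supply

p_load, p_supply, gain_db = 1.0, 2.0, 3.0       # hypothetical operating point
p_in = p_load / 10 ** (gain_db / 10.0)          # input power implied by the gain
eta = drain_efficiency(p_load, p_supply)
pae = power_added_efficiency(p_load, p_in, p_supply)
print(f"eta = {eta:.1%}, PAE = {pae:.1%}")
```

With only 3 dB of gain, roughly half of the output power must be supplied at the input, so the PAE falls to about half of the drain efficiency.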

Example 12.5

Discuss the PAE of the CS stage shown in Fig. 12.3.

Solution:

At low to moderate frequencies, the input impedance is capacitive and hence the average input power is zero. (Of course, driving a large capacitance is still difficult.) Thus, PAE = η. At high frequencies, the feedback due to the gate-drain capacitance introduces a real part in Z_in, causing the input port to draw some power.² Consequently, PAE < η. In stand-alone PAs, we may deliberately introduce a 50-Ω input resistance, in which case PAE < η.

12.1.3 Linearity

As explained in Chapter 3, the linearity of PAs becomes critical for some modulation schemes. In particular, PA nonlinearity leads to two effects: (1) high adjacent channel power as a result of spectral regrowth, and (2) amplitude compression. For example, QPSK modulation with baseband pulse shaping may suffer from the former and 16QAM from the latter. In some cases, AM/PM conversion may also be problematic.

The PA nonlinearity must be characterized with respect to the modulation scheme of interest. However, circuit-level simulations with actual modulated inputs take a very long time if they must produce an output spectrum that accurately reveals the ACPR (Chapter 3). Similarly, circuit-level simulations that quantify the effect of amplitude compression (i.e., the bit error rate) prove very cumbersome. For this reason, the PA characterization begins with two generic tests of nonlinearity based on unmodulated tones: intermodulation and compression. If employing two sufficiently large tones, the former provides some indication of ACPR. The amplitude of the tones is chosen such that each main component at the output is 6 dB below the full power level, thus producing the maximum desired output voltage swing when the two tones add in-phase [Fig. 12.6(a)]. For compression, a single tone is applied and its amplitude gradually increases so as to determine the output 1-dB compression point [Fig. 12.6(b)].

Figure 12.6 PA characterization by (a) two-tone test, (b) compression.
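The 6-dB choice in the two-tone test can be verified numerically: a tone 6 dB below full power has about half the full amplitude, so two such tones adding in phase reach the full swing. A minimal sketch, with a hypothetical 10-V full-power amplitude:

```python
def two_tone_amplitude(full_amplitude, backoff_db=6.0):
    """Amplitude of each tone when its power is backoff_db below full power."""
    return full_amplitude * 10 ** (-backoff_db / 20.0)

full_amplitude = 10.0                       # V, hypothetical full-power peak swing
tone_amplitude = two_tone_amplitude(full_amplitude)
peak_sum = 2.0 * tone_amplitude             # instant at which the tones align
print(f"each tone: {tone_amplitude:.2f} V, in-phase sum: {peak_sum:.2f} V")
```

The in-phase sum lands within a fraction of a percent of the full-power amplitude because 6 dB is not exactly a factor of two in voltage.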

The above tests yield a first-order estimate of the PA nonlinearity. However, a more rigorous characterization is eventually necessary. Since the PA contains many storage elements, its nonlinearity cannot be simply expressed as a polynomial. As explained in Chapter 2, a Volterra series can represent dynamic nonlinearities, but it tends to be rather complex. An alternative approach models the nonlinearity as follows [4]. Suppose the modulated input is of the form

$$V_{in}(t) = a(t)\cos[\omega_0 t + \phi(t)]. \tag{12.9}$$

Then, the output also contains amplitude and phase modulation and can be written as

$$V_{out}(t) = A(t)\cos[\omega_0 t + \phi(t) + \Theta(t)]. \tag{12.10}$$

We now make a “quasi-static” approximation. If the input signal bandwidth is much less than the PA bandwidth, i.e., if the PA can follow the signal dynamics closely, then we can assume that both A(t) and Θ(t) are nonlinear static functions of only the input amplitude, a(t). That is,

$$V_{out}(t) = A[a(t)]\cos\{\omega_0 t + \phi(t) + \Theta[a(t)]\}, \tag{12.11}$$

where A[a(t)] and Θ[a(t)] represent “AM/AM conversion” and “AM/PM conversion,” respectively [4]. For example, A and Θ are found to satisfy the following empirical equations:

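$$A(a) = \frac{\alpha_1 a}{1 + \beta_1 a^2} \tag{12.12}$$

$$\Theta(a) = \frac{\alpha_2 a^2}{1 + \beta_2 a^2}, \tag{12.13}$$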

where α_j and β_j are fitting parameters [4]. Illustrated in Fig. 12.7(a), A(a) is similar to the characteristic shown in Fig. 12.6(b) (but declines for high input levels). The AM/PM conversion function can also be obtained relatively easily by applying a tone at the PA input and measuring the PA phase shift as a function of the input amplitude.

Figure 12.7 Characteristics for AM/AM and AM/PM conversion.

The reader may wonder why the foregoing model is valid. Indeed, no analytical proof appears to have been offered to justify this model. Nonetheless, it has been experimentally verified that the model provides reasonable accuracy if the input signal bandwidth remains much smaller than the PA bandwidth. Note that for a cascade of stages, the overall model may be quite complex and the behavior of A and Θ quite different.

With A(a) and Θ(a) obtained from circuit simulations, the PA can be modeled by Eq. (12.11) and studied in a more efficient behavioral simulator, e.g., MATLAB. Thus, the effect of the PA nonlinearity on ACPR or the quality of signals such as OFDM waveforms can be quantified.
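A behavioral model of this kind takes only a few lines. The Python sketch below operates on the complex envelope a·exp(jφ) of the input and applies static AM/AM and AM/PM functions. The rational forms of `am_am` and `am_pm` and all four fitting values are assumptions chosen for illustration; in practice A(a) and Θ(a) would be tabulated from circuit simulations.

```python
import cmath
import math

def am_am(a, alpha1=2.0, beta1=1.0):
    """Assumed AM/AM curve: output amplitude versus input amplitude."""
    return alpha1 * a / (1.0 + beta1 * a * a)

def am_pm(a, alpha2=math.pi / 3.0, beta2=1.0):
    """Assumed AM/PM curve: added phase (rad) versus input amplitude."""
    return alpha2 * a * a / (1.0 + beta2 * a * a)

def pa_envelope(x):
    """Quasi-static PA model applied to one complex-envelope sample x."""
    a = abs(x)
    phi = cmath.phase(x)
    return cmath.rect(am_am(a), phi + am_pm(a))

for amplitude in (0.1, 0.5, 1.0, 2.0):
    y = pa_envelope(cmath.rect(amplitude, 0.0))
    print(f"a = {amplitude:.1f}: |y| = {abs(y):.3f}, "
          f"phase shift = {math.degrees(cmath.phase(y)):.1f} deg")
```

Passing each sample of a modulated envelope through `pa_envelope` and taking the spectrum of the result gives a first estimate of spectral regrowth without a transistor-level transient simulation.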

Another PA nonlinearity representation, called the “Rapp model” [5], is expressed as follows:

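$$V_{out} = \frac{\alpha V_{in}}{\left[1 + \left(\dfrac{V_{in}}{V_0}\right)^{2m}\right]^{1/(2m)}}, \tag{12.14}$$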

where α denotes the small-signal gain around V_in = 0, and V₀ and m are fitting parameters. Dealing with only static nonlinearity, this model has become popular in integrated PA design. We return to this model in our back-off calculations in Chapter 13. Other PA modeling methods are described in [6].
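A static characteristic of this type is easy to explore numerically. The sketch below assumes the commonly quoted form of the Rapp characteristic, αV_in/[1 + (V_in/V₀)^(2m)]^(1/(2m)), and uses hypothetical values for α, V₀, and m.

```python
def rapp(v_in, alpha=10.0, v0=1.0, m=2.0):
    """Assumed Rapp-type static nonlinearity: linear gain alpha for small
    inputs, saturating smoothly at alpha * v0 for large inputs."""
    return alpha * v_in / (1.0 + (abs(v_in) / v0) ** (2.0 * m)) ** (1.0 / (2.0 * m))

for v in (0.01, 0.5, 1.0, 2.0, 10.0):
    print(f"V_in = {v:5.2f} V -> V_out = {rapp(v):.3f} V")
```

Larger values of m sharpen the knee between the linear and saturated regions, which is why m serves as the main fitting parameter.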

12.1.4 Single-Ended and Differential PAs

Most stand-alone PAs have been designed as a cascade of single-ended stages. Two reasons account for this choice: the antenna is typically single-ended, and single-ended RF circuits are much simpler to test than their differential counterparts.

Single-ended PAs, however, suffer from two drawbacks. First, they “waste” half of the transmitter voltage gain because they sense only one output of the upconverter [Fig. 12.8(a)]. This issue can be alleviated by interposing a balun between the upconverter and the PA [Fig. 12.8(b)]. But the balun introduces its own loss, especially if it is integrated on the chip, limiting the voltage gain improvement to a few decibels (rather than 6 dB).

Figure 12.8 Upconverter/PA interface with (a) single-ended or, (b) balun connection.

The second drawback of single-ended PAs stems from the very large transient currents that they pull from the supply to the ground. As shown in Fig. 12.9(a), the supply bond wire inductance, L_B₁, alters the resonance or impedance transformation properties of the output network if it is comparable with L_D. Moreover, L_B₁ allows some of the output stage signal to travel back to the preceding stage(s) through the V_DD line, causing ripple in the frequency response or instability. Similarly, the ground bond wire inductance, L_B₂, degenerates the output stage and introduces feedback.

Figure 12.9 (a) Feedback in a single-ended PA due to bond wires, (b) less problematic situation in a differential PA.

By contrast, a differential realization greatly eases the above two issues. Illustrated in Fig. 12.9(b), such a topology draws much smaller transient currents from V_DD and ground lines, exhibiting less sensitivity to L_B₁ and L_B₂ and creating less feedback. The degeneration issue quantified in Example 12.4 is also relaxed considerably.

While the use of a differential PA ameliorates both the voltage gain and package parasitic issues, the PA must still drive a single-ended antenna in most cases. Thus, a balun must now be inserted between the PA and the antenna (Fig. 12.10).

Figure 12.10 Use of a balun between the PA and antenna.

Example 12.6

Suppose a given balun design has a loss of 1.5 dB. In which one of the transmitters shown in Figs. 12.8(b) and 12.10 does this loss affect the efficiency more adversely?

Solution:

In Fig. 12.8(b), the balun lowers the voltage gain by 1.5 dB but does not consume much power. For example, if the power delivered by the upconverter to the PA is around 0 dBm, then a balun loss of 1.5 dB translates to a heat dissipation of 0.3 mW. In Fig. 12.10, on the other hand, the balun experiences the entire power delivered by the PA to the load, dissipating substantial power. For example, if the PA output reaches 1 W, then a balun loss of 1.5 dB corresponds to 300 mW. The TX efficiency therefore degrades more significantly in the latter case.
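The two dissipation figures follow from converting the loss in decibels to a power ratio. A small Python sketch (the function name is illustrative):

```python
def dissipated_power(p_in_w, loss_db):
    """Power (W) converted to heat by a passive block with the given loss."""
    return p_in_w * (1.0 - 10 ** (-loss_db / 10.0))

heat_before_pa = dissipated_power(1e-3, 1.5)   # 0 dBm entering the balun
heat_after_pa = dissipated_power(1.0, 1.5)     # 1 W entering the balun
print(f"balun before PA: {heat_before_pa * 1e3:.2f} mW")
print(f"balun after PA:  {heat_after_pa * 1e3:.0f} mW")
```

A 1.5-dB loss removes about 29% of the incident power in both cases; only the absolute level differs, by a factor of 1000.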

Another useful property of differential PAs is their lower coupling to the LO and hence reduced LO pulling (Chapter 4). If propagating symmetrically toward the LO, the differential waveforms generated by each stage of the PA tend to cancel. Of course, if the PA incorporates symmetric inductors, then the problem of coupling remains (Chapter 7).

The trade-offs governing the choice of single-ended and differential PAs have led to two schools of thought: some TX designs are based on fully-differential circuits with an on-chip or off-chip balun preceding the output matching network, while others opt for a single-ended PA, with or without a balun following the upconverter.

12.2 Classification of Power Amplifiers

Power amplifiers have been traditionally categorized under many classes: A, B, C, D, E, F, etc. An attribute of classical PAs is that both the input and the output waveforms are considered sinusoidal. As we will see in Section 12.3, if this assumption is avoided, a higher performance can be achieved.

In this section, we describe classes A, B, and C, emphasizing their merits and drawbacks with respect to integrated implementation.

12.2.1 Class A Power Amplifiers

Class A amplifiers are defined as circuits in which the transistor(s) remain on and operate linearly across the full input and output range. Shown in Fig. 12.11 is an example. We note that the transistor bias current is chosen higher than the peak signal current, I_p, to ensure that the device does not turn off at any point during the signal excursion.

Figure 12.11 Class A stage.

The reader may wonder how we define “linear operation” here. After all, ensuring that the transistor is always on does not necessarily imply that the PA is sufficiently linear: if in Fig. 12.11, I₁ = 5I₂, the transistor transconductance varies considerably from t₁ to t₂ while the definition of class A seems to hold. This is where the definition of class A becomes vague. Nonetheless, we can still assert that if linearity is required, then class A operation is necessary.

Let us now compute the maximum drain (collector) efficiency of class A amplifiers. To reach maximum efficiency, we allow V_X in Fig. 12.11 to reach 2V_DD and nearly zero. Thus, the power delivered to the matching network is approximately equal to (2V_DD/2)²/(2R_in) = V_DD²/(2R_in), which is also delivered to R_L if the matching network is lossless. Also, recall from Example 12.1 that the inductive load carries a constant current of V_DD/R_in from the supply voltage. Thus,

$$\eta = \frac{V_{DD}^2/(2R_{in})}{V_{DD}^2/R_{in}} = 50\%. \tag{12.15-12.16}$$

The other 50% of the supply power is dissipated by M₁ itself.

Example 12.7

Is the foregoing calculation of efficiency consistent with the assumption of linearity in class A stages?

Solution:

No, it is not. With a sinusoidal input, V_X in Fig. 12.11 reaches 2V_DD only if the transistor turns off. This ensures that the current swing delivered to the load goes from zero to twice the bias value.

It is important to recognize the assumptions leading to an efficiency of 50% in class A stages: (1) the drain (collector) peak-to-peak voltage swing is equal to twice the supply voltage, i.e., the transistor can withstand a drain-source (or collector-emitter) voltage of 2V_DD with no reliability or breakdown issues;³ (2) the transistor barely turns off, i.e., the nonlinearity resulting from the very large change in the transconductance of the device is tolerable; (3) the matching network interposed between the output transistor and the antenna is lossless.

Example 12.8

Explain why low-gain output stages suffer from a more severe efficiency-linearity trade-off.

Solution:

Consider the two scenarios depicted in Fig. 12.12. In both cases, for M₁ to remain in saturation at t = t₁, the drain voltage must exceed V₀ + V_p,in − V_TH. In the high-gain stage of Fig. 12.12(a), V_p,in is small, allowing V_X to come closer to zero than in the low-gain stage of Fig. 12.12(b).

Figure 12.12 Nonlinearity in a (a) high-gain and (b) low-gain stage.

The above example indicates that the minimum drain voltage may not be negligible with respect to V_DD, yielding an output swing less than 2V_DD. We must therefore compute the efficiency for lower output signal levels. The result also proves useful in transmitters with a variable output power. For example, we note from Chapter 4 that CDMA networks require that the mobile continually adjust its transmitted power so that the base station receives an approximately constant level.

Suppose the PA in Fig. 12.11 must deliver a peak voltage swing of V_p to R_in, i.e., a power of V_p²/(2R_in) to the antenna if the matching network is lossless. We consider three cases: (1) the supply voltage and bias current remain at the levels necessary for full output power and only the input signal swing is reduced; (2) the supply voltage remains unchanged but the bias current is reduced in proportion to the output voltage swing; (3) both the supply voltage and the bias current are reduced in proportion to the output voltage swing.

In the first case, the bias current is equal to V_DD/R_in, and hence a power of V_DD²/R_in is drawn from the battery. Consequently,

$$\eta = \frac{V_p^2/(2R_{in})}{V_{DD}^2/R_{in}} = \frac{V_p^2}{2V_{DD}^2}. \tag{12.17-12.18}$$

The efficiency thus falls sharply as the input and output voltage swings decrease.

In the second case, the bias current is reduced to that necessary for a peak swing of V_p, i.e., V_p/R_in. It follows that

$$\eta = \frac{V_p^2/(2R_{in})}{(V_p/R_{in})V_{DD}} = \frac{V_p}{2V_{DD}}. \tag{12.19-12.20}$$

Here, the efficiency falls linearly as V_p decreases and V_DD remains constant.

In the third case, the supply voltage is also scaled, ideally according to the relation V_DD = V_p. Thus,

$$\eta = \frac{V_p^2/(2R_{in})}{(V_p/R_{in})V_p} = 50\%. \tag{12.21}$$

While this case is the most desirable, it is difficult to design PA stages with a variable supply voltage. Figure 12.13 summarizes the results.

Figure 12.13 Efficiency as a function of peak output voltage for different scaling scenarios.
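The three scaling scenarios can be compared with a short Python sketch that computes the load power and the supply power directly from the bias conditions described above. The function name and the normalized values (V_DD = 1 V, R_in = 1 Ω) are illustrative assumptions.

```python
def class_a_efficiency(v_p, v_dd_full, r_in, case):
    """Drain efficiency of an ideal class A stage at peak swing v_p.
    case 1: fixed supply and bias; case 2: bias scaled with v_p;
    case 3: both supply and bias scaled with v_p."""
    p_out = v_p ** 2 / (2.0 * r_in)
    if case == 1:
        i_bias, v_supply = v_dd_full / r_in, v_dd_full
    elif case == 2:
        i_bias, v_supply = v_p / r_in, v_dd_full
    elif case == 3:
        i_bias, v_supply = v_p / r_in, v_p
    else:
        raise ValueError("case must be 1, 2, or 3")
    return p_out / (i_bias * v_supply)

for v_p in (1.0, 0.5, 0.25):
    etas = [class_a_efficiency(v_p, 1.0, 1.0, case) for case in (1, 2, 3)]
    print(f"V_p/V_DD = {v_p:.2f}: " + ", ".join(f"{e:.1%}" for e in etas))
```

At half the full swing, the efficiency is 12.5% in the first case, 25% in the second, and 50% in the third, reproducing the quadratic, linear, and constant trends.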

Example 12.9

A student attempts to construct an output stage with a variable supply voltage as shown in Fig. 12.14. Here, M₂ operates in the triode region, acting as a voltage-controlled resistor, and C₂ establishes an ac ground at node Y. Can this circuit achieve an efficiency of 50%?

Figure 12.14 Output stage with variable supply voltage.

Solution:

No, it cannot. Unfortunately, M₂ itself consumes power. If the bias current is chosen equal to V_p/R_in, then the total power drawn from V_DD is still given by (V_p/R_in)V_DD regardless of the on-resistance of M₂. Thus, M₂ consumes a power of (V_p/R_in)²R_on₂, where R_on₂ denotes its on-resistance.

Conduction Angle

It is sometimes helpful to distinguish PA classes by the “conduction angle” of their output transistor(s). The conduction angle is defined as the fraction of the signal period during which the transistor(s) remain on, multiplied by 360°. In class A stages, the conduction angle is 360° because the output transistor is always on.

12.2.2 Class B Power Amplifiers

The definition of class B operation has changed over time! The traditional class B PA employs two parallel stages each of which conducts for only 180°, thereby achieving a higher efficiency than the class A counterpart. Figure 12.15 shows an example, where the drain currents of M₁ and M₂ are combined by transformer T₁. We may view the circuit as a quasi-differential stage and a balun driving the single-ended load. But class B operation requires that each transistor turn off for half of the period (i.e., the conduction angle is 180°). The gate bias voltage of the devices is therefore chosen approximately equal to their threshold voltage.

Figure 12.15 Class B stage.

Example 12.10

Explain how T₁ combines the half-cycle current waveforms generated by M₁ and M₂.

Solution:

Using superposition, we draw the output network in the two half cycles as shown in Fig. 12.16. When M₁ is on, I_D₁ flows from node X, producing a current in the secondary that flows into R_L and generates a positive V_out [Fig. 12.16(a)]. Conversely, when M₂ is on and draws current from node Y, the secondary current flows out of R_L and generates a negative V_out [Fig. 12.16(b)].

Figure 12.16 Output network currents during (a) positive and (b) negative output half cycles.

If the parasitic capacitances are small and the primary and secondary inductances are large, then V_X and V_Y in Fig. 12.15 are also half-wave rectified sinusoids that swing around V_DD (Fig. 12.17). In Problem 12.3, we show that the swing above V_DD is approximately half that below V_DD, an undesirable situation because it results in a low efficiency. For this reason, the secondary (or primary) of the transformer is tuned by a parallel capacitance so as to suppress the harmonics of the half-wave rectified sinusoids at X and Y, allowing equal swings above and below V_DD.

Figure 12.17 Current and voltage waveforms in a class B stage.

Let us compute the efficiency of the class B stage shown in Fig. 12.15. Suppose each transistor draws a peak current of I_p from the primary. As explained in Example 12.10, this current flows through half of the primary winding (because the other half carries a zero current). Assuming the turns ratios shown in Fig. 12.18, we recognize that a half-cycle sinusoidal current, I_D₁ = I_p sin ω₀t, 0 < t < π/ω₀, produces a similar current in the secondary, but with the peak given by (m/n)I_p. Thus, the total current flowing through R_L in each full cycle is equal to I_L = (m/n)I_p sin ω₀t, producing an output voltage given by

$$V_{out}(t) = \frac{m}{n} I_p R_L \sin \omega_0 t \tag{12.22}$$

and delivering an average power of

$$P_{out} = \left(\frac{m}{n}\right)^2 \frac{R_L I_p^2}{2}. \tag{12.23}$$

Figure 12.18 Class B circuit for efficiency calculation.

We must now determine the average power drawn from V_DD. The half-wave rectified current drawn by each transistor has an average of I_p/π (why?). Since two of these current waveforms are drawn from V_DD in each period, the average power provided by V_DD is equal to

$$P_{supp} = 2\,\frac{I_p}{\pi}\,V_{DD}. \tag{12.24}$$

Dividing Eq. (12.23) by Eq. (12.24) gives the drain (collector) efficiency of class B stages:

$$\eta = \frac{\pi}{4V_{DD}} \left(\frac{m}{n}\right)^2 I_p R_L. \tag{12.25}$$

As expected, η is a function of I_p.

In our last step, we calculate the voltage swings at X and Y in the presence of a resonant load in the secondary (or primary). Since the resonance suppresses the higher harmonics of the half-wave rectified cycles, V_X and V_Y resemble sinusoids that are 180° out of phase and have a dc level equal to V_DD (Fig. 12.19). That is,

$$V_X = V_{DD} - V_p \sin \omega_0 t \tag{12.26}$$

$$V_Y = V_{DD} + V_p \sin \omega_0 t. \tag{12.27}$$

Figure 12.19 Class B circuit with resonant secondary network.

The primary of the transformer therefore senses a voltage waveform given by

$$V_Y - V_X = 2V_p \sin \omega_0 t, \tag{12.28}$$

which, upon experiencing a ratio of n/(2m), yields the output voltage:

$$V_{out}(t) = \frac{n}{2m} \cdot 2V_p \sin \omega_0 t = \frac{n}{m} V_p \sin \omega_0 t. \tag{12.29-12.30}$$

It follows that

$$I_p = \left(\frac{n}{m}\right)^2 \frac{V_p}{R_L}. \tag{12.31}$$

We choose V_p = V_DD to maximize the efficiency, obtaining from Eq. (12.25)

$$\eta = \frac{\pi}{4}\,\frac{V_p}{V_{DD}}\bigg|_{V_p = V_{DD}} = \frac{\pi}{4} \approx 79\%. \tag{12.32-12.33}$$

In recent RF design literature, class B operation often refers to half of the circuits shown in Figs. 12.15 and 12.18, with the transistor still conducting for only half a cycle. Such a circuit, of course, is quite nonlinear but still has a maximum efficiency of π/4.
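The π/4 limit can be checked numerically from the quantities described in this section. The sketch below assumes a lossless transformer with m turns in each half of the primary and n turns in the secondary, a resonant secondary so that each drain swings sinusoidally by V_p around V_DD, and hypothetical component values.

```python
import math

def class_b_efficiency(v_p, v_dd, r_load, m, n):
    """Drain efficiency of the two-transistor class B stage for a drain
    swing v_p (peak) around v_dd, with an ideal m:m:n transformer."""
    # Peak drain current required so the load voltage matches the drain swing:
    # (m/n) * I_p * R_L = (n/m) * V_p
    i_p = (n / m) ** 2 * v_p / r_load
    p_out = (m / n) ** 2 * r_load * i_p ** 2 / 2.0
    p_supply = 2.0 * (i_p / math.pi) * v_dd    # two half-wave rectified currents
    return p_out / p_supply

eta_max = class_b_efficiency(v_p=1.0, v_dd=1.0, r_load=50.0, m=1, n=2)
eta_half = class_b_efficiency(v_p=0.5, v_dd=1.0, r_load=50.0, m=1, n=2)
print(f"full swing: {eta_max:.1%}, half swing: {eta_half:.1%}")
```

Unlike the class A stage with fixed bias, the efficiency here drops only linearly with the swing because the average supply current scales with I_p.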

As mentioned in Section 12.1.4, the use of an on-chip balun at the PA output lowers the efficiency. For power levels above roughly 100 mW, an off-chip balun may be used if efficiency is critical.

Class AB Power Amplifiers

The term “class AB” is sometimes used to refer to a single-ended PA (e.g., a CS stage) whose conduction angle falls between 180° and 360°, i.e., in which the output transistor turns off for less than half of a period. From another perspective, a class AB PA is less linear than a class A stage and more linear than a class B stage. This is usually accomplished by reducing the input voltage swing and hence backing off from the 1-dB compression point. Nonetheless, the term class AB remains vague.

12.2.3 Class C Power Amplifiers

Our study of class A and B stages indicates that a smaller conduction angle yields a higher efficiency. In class C stages, this angle is reduced further (and the circuit becomes more nonlinear).

The class A topology of Fig. 12.11 can be modified to operate in class C. Depicted in Fig. 12.20(a), the circuit is biased such that M₁ turns on if the peak value of V_in raises V_X above V_TH. As illustrated in Fig. 12.20(b), V_X exceeds V_TH for only a fraction of the period, as if M₁ were stimulated by a narrow pulse. As a result, the transistor delivers a narrow pulse of current to the output every cycle. In order to avoid large harmonic levels at the antenna, the matching network must provide some filtering. In fact, the input impedance of the matching network is also designed to resonate at the frequency of interest, thereby making the drain voltage a sinusoid.

Figure 12.20 (a) Class C stage and (b) its waveforms.

The distinction between class C and one-transistor class B stages is in the conduction angle, θ. As θ decreases, the transistor is on for a smaller fraction of the period, thus dissipating less power. For the same reason, however, the transistor delivers less power to the load.

If the drain current of M₁ in Fig. 12.20(a) is assumed to be the peak section of a sinusoid and the drain voltage a sinusoid having a peak amplitude of V_DD, then the efficiency can be obtained as [7]

$$\eta = \frac{1}{4}\,\frac{\theta - \sin\theta}{\sin(\theta/2) - (\theta/2)\cos(\theta/2)}. \tag{12.34}$$

Sketched in Fig. 12.21(a), this relation suggests an efficiency of 100% as θ approaches zero.

Figure 12.21 (a) Efficiency and (b) output power as a function of conduction angle.

The maximum efficiency of 100% is often considered a prominent feature of class C stages. However, another attribute that must also be taken into account is the actual power delivered to the load. It can be proved that [7]

$$P_{out} \propto \frac{\theta - \sin\theta}{1 - \cos(\theta/2)}. \tag{12.35}$$

Applying L’Hopital’s rule, the reader can prove that P_out falls to zero as θ approaches zero. In other words, for a given design, a class C stage provides a high efficiency only if it delivers a fraction of the peak output power (the power corresponding to full class A operation).
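The trade-off between efficiency and deliverable power can be tabulated numerically. The Python sketch below assumes the standard closed-form expressions for a reduced-conduction-angle stage, η = (θ − sin θ)/{4[sin(θ/2) − (θ/2)cos(θ/2)]} and P_out ∝ (θ − sin θ)/[1 − cos(θ/2)]; these are taken as assumptions here, and the output power is normalized to its class A value (θ = 2π).

```python
import math

def class_c_efficiency(theta):
    """Assumed efficiency versus conduction angle theta (radians)."""
    return (theta - math.sin(theta)) / (
        4.0 * (math.sin(theta / 2.0) - (theta / 2.0) * math.cos(theta / 2.0)))

def relative_output_power(theta):
    """Assumed output power versus conduction angle, normalized to class A."""
    def shape(t):
        return (t - math.sin(t)) / (1.0 - math.cos(t / 2.0))
    return shape(theta) / shape(2.0 * math.pi)

for degrees in (360, 180, 120, 60, 20):
    theta = math.radians(degrees)
    print(f"theta = {degrees:3d} deg: eta = {class_c_efficiency(theta):.1%}, "
          f"P_out/P_classA = {relative_output_power(theta):.2f}")
```

The table reproduces the 50% and π/4 limits at 360° and 180° and shows the output power collapsing as the efficiency approaches 100%.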

How can a class C stage provide an output power comparable to that of a class A design? The small conduction angle dictates that the output transistor be very wide so as to deliver a high current for a short amount of time. In other words, the first harmonic of the drain current must be equal in the two cases.

Example 12.11

Determine the amplitude of the first harmonic of the transistor drain current in Fig. 12.20 for a conduction angle of θ.

Solution:

Consider the waveform shown in Fig. 12.22, where conduction begins at point A and ends at point B. The angle of the sinusoid reaches α at A and π − α at B such that π − α − α = θ and hence α = (π − θ)/2. The Fourier coefficients of the first harmonic are obtained as

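$$a_1 = \frac{2}{T_0}\int_{\alpha/\omega_0}^{(\pi-\alpha)/\omega_0} I_p \sin\omega_0 t \,\sin\omega_0 t \, dt \tag{12.36}$$

$$b_1 = \frac{2}{T_0}\int_{\alpha/\omega_0}^{(\pi-\alpha)/\omega_0} I_p \sin\omega_0 t \,\cos\omega_0 t \, dt, \tag{12.37}$$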

Figure 12.22 Waveform in a class C stage for harmonic calculation.

where T₀ = 2π/ω₀ is the period. It follows that

$$a_1 = \frac{I_p}{2\pi}\,(\pi - 2\alpha + \sin 2\alpha) \tag{12.38}$$

$$b_1 = 0, \tag{12.39}$$

and hence the first harmonic is expressed as

$$I_{\omega_0}(t) = \frac{I_p}{2\pi}\,(\pi - 2\alpha + \sin 2\alpha)\sin\omega_0 t. \tag{12.40}$$

Note that a₁ → 0 as α → π/2. For example, if α = π/4, then a₁ ≈ 0.41I_p; the transistor must therefore be about 2.4 times as large as in a class A stage for the same output power. Upon multiplication by R_in, this harmonic must yield a drain voltage swing of nearly 2V_DD.
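The quoted value of 0.41I_p can be confirmed by numerical integration. The Python sketch below assumes that the drain current equals I_p sin ω₀t while the transistor conducts (angles between α and π − α) and is zero elsewhere, and it compares a midpoint-rule estimate of the in-phase Fourier coefficient with the closed form (I_p/2π)(π − 2α + sin 2α) that this assumption yields.

```python
import math

def first_harmonic_closed_form(alpha, i_p=1.0):
    """Fundamental amplitude for conduction between alpha and pi - alpha."""
    return i_p / (2.0 * math.pi) * (math.pi - 2.0 * alpha + math.sin(2.0 * alpha))

def first_harmonic_numeric(alpha, i_p=1.0, steps=20000):
    """Midpoint-rule estimate of (1/pi) * integral of I_p*sin(x)*sin(x) dx
    over the conduction window, with x = w0 * t."""
    width = (math.pi - 2.0 * alpha) / steps
    total = 0.0
    for k in range(steps):
        x = alpha + (k + 0.5) * width
        total += i_p * math.sin(x) * math.sin(x)
    return total * width / math.pi

alpha = math.pi / 4.0
a1 = first_harmonic_closed_form(alpha)
print(f"closed form: {a1:.4f} I_p, numeric: {first_harmonic_numeric(alpha):.4f} I_p")
print(f"required device scaling: {1.0 / a1:.2f}x")
```

Setting α = 0 in the same functions gives I_p/2, the fundamental of a half-wave rectified sinusoid, which serves as a sanity check against the class B case.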

In modern RF design, class C operation has been replaced by other efficient amplification techniques that do not require such large transistors.

12.3 High-Efficiency Power Amplifiers

The main premise in class A, B, and C amplifiers has been that the output transistor drain (or collector) current and voltage waveforms are sinusoidal (or a section of a sinusoid). If this premise is discarded, higher harmonics can be exploited to improve the performance. Described below are several examples of such techniques. The following topologies rely on specific output passive networks to shape the waveforms, minimizing the time during which the output transistor carries a large current and sustains a large voltage. This approach reduces the power consumed by the transistor and raises the efficiency. We note, however, that the large parasitics of on-chip inductors typically dictate that matching networks be realized externally, making “fully-integrated PAs” a misnomer.

12.3.1 Class A Stage with Harmonic Enhancement

Recall from our study of the class A stage in Fig. 12.11 that, for maximum efficiency, the transistor current swings by a large amount, experiencing nonlinearity. Thus, the current contains a significant second and/or third harmonic. Now suppose the matching network is designed such that its input impedance is low at the fundamental and high at the second harmonic. As illustrated in Fig. 12.23, the sum of the resulting voltage waveforms exhibits narrower pulses than the fundamental, reducing the overlap time between the voltage across and the current flowing in the output transistor. Consequently, the average power consumed by the output transistor decreases and the efficiency increases.

Figure 12.23 Example of second harmonic enhancement.

It is interesting that the above modification need not increase the harmonic content of the signal delivered to the load. The technique simply realizes different termination impedances for different harmonics to make the drain voltage approach a square wave.

As an example, consider the class A circuit shown in Fig. 12.24(a), where L₁, C₁ and C₂ form a matching network that transforms the 50-Ω load to Z₁ = 9 Ω + j0 at f = 850 MHz and Z₂ = 330 Ω + j0 at 2f = 1.7 GHz [8]. In this case, the second harmonic is enhanced by a factor of 37. Figure 12.24(b) shows the drain voltage. The circuit delivers a power of 2.9 W to the load with 73% efficiency and a third harmonic of −25 dBc [8]. Other considerations for harmonic termination are described in [9]. This enhancement technique can be applied to other PA classes as well.

Figure 12.24 (a) Class A stage with harmonic enhancement, (b) drain waveform.

12.3.2 Class E Stage

Class E stages are nonlinear amplifiers that achieve efficiencies approaching 100% while delivering full power, a remarkable advantage over class C circuits. Before studying class E PAs in detail, we first revisit the simple circuit of Fig. 12.3(a), shown in Fig. 12.25.

Figure 12.25 Output stage with switching transistor.

Suppose the output transistor in this circuit operates as a switch, rather than a voltage-dependent current source, ideally turning on and off abruptly. Called a “switching power amplifier,” such a topology achieves a high efficiency if (1) M₁ sustains a small voltage when it carries current, (2) M₁ carries a small current when it sustains a finite voltage, and (3) the transition times between the on and off states are minimized [10]. From (1) and (3), we conclude that the on-resistance of the switch must be very small and the voltage applied to the gate of M₁ must approximate a rectangular waveform. However, even with these two conditions, (2) may still be violated if M₁ turns on when V_X is high. Of course, in practice it is difficult to obtain sharp input transitions at high frequencies.

It is important to understand the fundamental difference between the PAs studied in previous sections and the switching stage of Fig. 12.25: in the former, the output matching network is designed with the assumption that the transistor operates as a current source, whereas in the latter, this assumption is not necessary. If the transistor is to remain a current source, then the minimum value of the drain voltage and the maximum value of the gate voltage must be precisely controlled such that the transistor does not enter the triode region. The minimum required drain-source voltage translates to a lower efficiency even if all of the devices and waveforms are ideal. By contrast, in switching amplifiers the drain voltage can approach zero (or even a somewhat negative value).

A serious dilemma in nonlinear PA design is that the gate of the output device must be switched as abruptly as possible so as to maximize the efficiency [Fig. 12.26(a)], but the large output transistor typically necessitates resonance at its gate, inevitably receiving a nearly sinusoidal waveform [Fig. 12.26(b)].

Figure 12.26 (a) Switching stage with sharp input waveform, (b) gradual waveform due to resonance.

Class E amplifiers deal with the finite input and output transition times by proper load design. Shown in Fig. 12.27(a), a class E stage consists of an output transistor, M₁, a grounded capacitor, C₁, and a series network C₂ and L₁ [10]. Note that C₁ includes the junction capacitance of M₁ and the parasitic capacitance of the RFC. The values of C₁, C₂, L₁, and R_L are chosen such that V_X satisfies three conditions: (1) as the switch turns off V_X remains low long enough for the current to drop to zero, i.e., V_X and I_D₁ have nonoverlapping waveforms [Fig. 12.27(b)]; (2) V_X reaches zero just before the switch turns on [Fig. 12.27(c)]; and (3) dV_X/dt is also near zero when the switch turns on. We examine these conditions to understand the circuit’s properties.

Figure 12.27 (a) Class E stage, (b) condition to ensure minimal overlap between drain current and voltage, (c) condition to ensure low sensitivity to timing errors.

The first condition, guaranteed by C₁, resolves the issue of finite fall time at the gate of M₁. Without C₁, V_X would rise as V_in dropped, allowing M₁ to dissipate substantial power.

The second condition ensures that the V_DS and I_D of the switching device do not overlap in the vicinity of the turn-on point, thus minimizing the power loss in the transistor even with finite input and output transition times.

The third condition lowers the sensitivity of the efficiency to violations of the second condition. That is, if device or supply variations introduce some overlap between the voltage and current waveforms, the efficiency degrades only slightly because dV_X/dt = 0 means V_X does not change significantly near the turn-on point.

The implementation of the second and third conditions is less straightforward. After the switch turns off, the load network operates as a damped second-order system [10] with initial conditions across C₁ and C₂ and in L₁. The time response depends on the Q of the network and appears as shown in Fig. 12.28 for underdamped, overdamped, and critically-damped conditions. We note that in the last case, V_X approaches zero with zero slope. Thus, if the switch begins to turn on at this time, the second and third conditions are met.

Figure 12.28 Class E matching network viewed as a damped network.

Modeling a class E stage as shown in Fig. 12.29(a), plot the circuit’s voltages and currents.

Figure 12.29 (a) Model of class E stage, (b) simplified circuit when transistor is on, (c) voltage and current waveforms, (d) simplified circuit when transistor is off.

Solution:

When M₁ turns on, it shorts node X to ground but carries little current because V_X is already near zero at this time (second condition described above) [Fig. 12.29(b)]. If R_on₁ is small, V_X remains near zero and L_D sustains a relatively constant voltage, thus carrying a current given by

(12.41)-(12.42)

In other words, one half cycle is dedicated to charging L_D with minimal drop across M₁ [Fig. 12.29(c)]. When M₁ turns off, the inductor current begins to flow through C₁ and the load [Fig. 12.29(d)], raising V_X. This voltage reaches a peak at t = t₁ and begins to fall thereafter, approaching zero with a zero slope at the end of the second half cycle (second and third conditions described above). The matching network attenuates higher harmonics of V_X, yielding a nearly sinusoidal output.
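The charging phase described above can be sketched numerically. With V_X held near zero, the inductor sees a roughly constant voltage V_DD, so its current ramps linearly with time. The component values below (V_DD = 1 V, L_D = 10 nH, a 1-GHz carrier) are hypothetical and serve only to illustrate the ramp.

```python
def inductor_current(t, v_dd, l_d, i0=0.0):
    """Current in L_D after time t with a constant voltage v_dd across it."""
    return i0 + v_dd * t / l_d

half_period = 0.5e-9                        # half cycle of an assumed 1-GHz carrier
delta_i = inductor_current(half_period, v_dd=1.0, l_d=10e-9)   # amperes
```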

Class E stages are quite nonlinear and exhibit a trade-off between efficiency and output harmonic content. For low harmonics, the Q of the output network must be higher than that typically required by the second and third conditions. In most standards, the harmonics of the carrier must be sufficiently small because they fall into other communication bands. (Note that a low harmonic content does not necessarily mean that the PA itself is linear; the output transistor may still create spectral regrowth or amplitude compression.)

Another property of class E amplifiers is the large peak voltage that the switch sustains in the off state, approximately 3.56V_DD − 2.56V_S, where V_S is the minimum voltage across the transistor [10]. With V_DD = 1 V and V_S = 50 mV, the peak exceeds 3 V, raising serious device reliability or breakdown issues.
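The peak-voltage expression quoted above is easily evaluated. The short sketch below uses the coefficients 3.56 and 2.56 from the text; the function name is illustrative.

```python
def class_e_peak_voltage(v_dd, v_s):
    """Approximate peak off-state switch voltage: 3.56*V_DD - 2.56*V_S."""
    return 3.56 * v_dd - 2.56 * v_s

peak = class_e_peak_voltage(v_dd=1.0, v_s=0.05)   # volts
```

For V_DD = 1 V and V_S = 50 mV the result is about 3.4 V, consistent with the statement that the peak exceeds 3 V.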

The design equations of class E stages are beyond the scope of this book. The reader is referred to [10] for details.

12.3.3 Class F Power Amplifiers

The idea of harmonic termination described in Section 12.3.1 can be extended to nonlinear amplifiers as well. If in the generic switching stage of Fig. 12.25 the load network provides a high termination impedance at the second or third harmonics, the voltage waveform across the switch exhibits sharper edges than a sinusoid, thereby reducing the power loss in the transistor. Such a circuit is called a class F stage [11].

Figure 12.30(a) shows an example of the class F topology. The tank consisting of L₁ and C₁ resonates at twice or three times the input frequency, approximating an open circuit. As depicted in Fig. 12.30(b), V_X approaches a rectangular waveform with the addition of the third harmonic.

Figure 12.30 Example of class F stage.

Explain why a class B stage does not lend itself to third-harmonic peaking.

Solution:

If the output transistor conducts for half of the cycle, the resulting half-wave rectified current contains no third harmonic. The Fourier coefficients of the third harmonic are given by

(12.43)-(12.45)

and

(12.46)-(12.48)
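The absence of a third harmonic can also be verified numerically. The sketch below integrates the cosine and sine Fourier coefficients of a half-wave rectified sinusoid of unit peak; the function name and the midpoint-rule integration are our own choices, not part of the derivation above.

```python
import math

def harmonic_coeffs(n, steps=20_000):
    """Cosine (a_n) and sine (b_n) coefficients of a half-wave rectified sinusoid."""
    d = 2 * math.pi / steps
    a = b = 0.0
    for k in range(steps):
        theta = (k + 0.5) * d
        i = max(math.sin(theta), 0.0)       # transistor conducts for half of the cycle
        a += i * math.cos(n * theta)
        b += i * math.sin(n * theta)
    return a * d / math.pi, b * d / math.pi

a1, b1 = harmonic_coeffs(1)   # fundamental
a2, b2 = harmonic_coeffs(2)   # second harmonic is present
a3, b3 = harmonic_coeffs(3)   # third harmonic vanishes
```

The second harmonic is nonzero while both third-harmonic coefficients are zero to within numerical error, so a class B current offers nothing for a third-harmonic termination to act upon.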

The above example suggests that third-harmonic peaking is viable only if the output transistor experiences “hard” switching, i.e., its output current resembles a rectangular wave. This in turn requires that the gate (or base) voltage be driven by relatively sharp edges.

If the drain current of the transistor is assumed to be a half-wave rectified sinusoid, it can be proved that the peak efficiency of class F amplifiers is equal to 88% for third-harmonic peaking [11].

12.4 Cascode Output Stages

Our study of PA stages in the previous sections reveals that to achieve a high efficiency, the output stage must produce a waveform that swings above V_DD. For example, in class A and B efficiency calculations, the drain waveform is assumed to have a peak-to-peak swing of nearly 2V_DD. However, if V_DD is chosen equal to the nominal supply voltage of the process, the output transistor experiences breakdown or substantial stress. One can choose V_DD equal to half of the maximum tolerable voltage of the transistor, but with two penalties: (a) the lower headroom limits the linear voltage range of the circuit, and (b) the proportionally higher output current (for a given output power) leads to a greater loss in the output matching network, reducing the efficiency.

A cascode output stage somewhat relaxes the above constraints. As shown in Fig. 12.31(a), the cascode device “shields” the input transistor as V_X rises, keeping the drain-source voltage of M₁ less than V_b − V_TH₂ (why?). Depicted in Fig. 12.31(b) are the typical waveforms: V_X swings by about 2V_DD and V_Y by about V_b − V_TH (if the minimum drain-source voltages are small).

Figure 12.31 (a) Cascode PA and (b) its waveforms.

Determine the maximum terminal-to-terminal voltage differences of M₁ and M₂ in Fig. 12.31(a). Assume V_in has a peak amplitude of V₀ and a dc level of V_m, and V_X has a peak amplitude of V_p (and a dc level of V_DD).

Solution:

Transistor M₁ experiences maximum V_DS as V_in falls to V_m − V₀. If M₁ nearly turns off, then V_DS₁ ≈ V_b − V_TH₂, V_GS₁ = V_m − V₀, and V_DG₁ = V_b − V_TH₂ − (V_m − V₀). For the same input level, the drain voltage of M₂ reaches its maximum of V_DD + V_p, creating

(12.49)

and

(12.50)

Also, the drain-bulk voltage of M₂ reaches V_DD + V_p.

In the cascode topology of Fig. 12.31(a), the values of V_b and V_p must be chosen so as to guarantee V_DS₂ and V_DG₂ remain below V_DD at all times. (The drain-bulk voltage is typically allowed to reach 2V_DD or even higher with no reliability concerns.) From Eqs. (12.49) and (12.50), we can write respectively,

(12.51)-(12.52)

The former is a stronger condition and reduces to

(12.53)

For example, if V_b = V_DD, then V_p ≤ V_DD − V_TH₂; i.e., the peak-to-peak swing at X is limited to 2V_DD − 2V_TH₂. With body effect, V_TH₂ may reach 0.5 V in 90-nm and 65-nm technologies, yielding a total swing of only 1 V_pp for V_DD = 1 V, about the same as that of a noncascoded common-source stage! We therefore observe that the cascode topology offers only a marginal increase in the maximum allowable output swing at low supply voltages.⁴ Since a cascode topology with a supply voltage of V_DD provides an output swing approximately equal to that of a common-source stage with a supply voltage of V_DD/2, we expect the former to exhibit an efficiency about half that of the latter, i.e., about 25% in class A operation.
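The swing limit can be tabulated with a few lines of Python. The stress expressions below are reconstructed from the preceding example (drain of M₂ at V_DD + V_p, its source near V_b − V_TH₂, its gate at V_b) and should be read as a sketch under those assumptions; V_DD = 1 V is assumed for the numbers.

```python
def cascode_m2_stress(v_dd, v_p, v_b, v_th2):
    """Worst-case V_DS2 and V_DG2 when the drain of M2 peaks at V_DD + V_p."""
    v_ds2 = v_dd + v_p - (v_b - v_th2)      # source of M2 sits near V_b - V_TH2
    v_dg2 = v_dd + v_p - v_b                # gate of M2 is tied to V_b
    return v_ds2, v_dg2

def max_peak_swing(v_b, v_th2):
    """Largest V_p that keeps V_DS2 at or below V_DD: V_p <= V_b - V_TH2."""
    return v_b - v_th2

v_dd, v_th2 = 1.0, 0.5
v_p = max_peak_swing(v_b=v_dd, v_th2=v_th2)
v_ds2, v_dg2 = cascode_m2_stress(v_dd, v_p, v_b=v_dd, v_th2=v_th2)
pp_swing = 2 * v_p                          # peak-to-peak swing at X
```

With these values V_DS₂ just reaches V_DD and the peak-to-peak swing is 1 V, the same as that of a common-source stage operating from V_DD/2.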

Let us now compare the cascode and CS stages in terms of their linearity. For the stages shown in Fig. 12.32, we seek the maximum output voltage swing that places M₁ at the edge of saturation. From Fig. 12.32(a),

(12.54)

and from Fig. 12.32(b),

(12.55)

Figure 12.32 (a) Cascode and (b) CS stages for linearity analysis.

It follows that

(12.56)

Thus, the CS stage remains linear across a wider output voltage range than the cascode circuit does.

The foregoing study suggests that, at low supply voltages, cascode output stages offer only a slight voltage swing advantage over their CS counterparts, but at the cost of efficiency and linearity. Nonetheless, by virtue of their high reverse isolation (a small |S₁₂|), cascode stages experience less feedback, thus proving more stable. As studied in Chapter 5 for low-noise amplifiers, a simple CS stage may suffer from a negative input resistance.

Consider the two-stage PA shown in Fig. 12.33(a). If the output stage exhibits a negative input resistance, how can the cascade be designed to remain stable?

Figure 12.33 (a) Cascade of two CS stages, (b) simplified model of (a), (c) representation of first stage by a resonant impedance.

Solution:

Drawing the Thevenin equivalent of the first stage as shown in Fig. 12.33(b), we observe that instability can be avoided if

(12.57)

so that V_Thev does not absorb energy from the circuit. If Z_out is modeled by a parallel tank [Fig. 12.33(c)], then

(12.58)

Thus, we require that

(12.59)

Of course, this condition must hold at all frequencies and for a certain range of R_in. For example, if the user of a cell phone wraps his/her hand around the antenna, R_L and hence R_in change.
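A frequency sweep makes the last remark concrete. The sketch below assumes that the stability condition amounts to Re{Z_out} + R_in > 0 and models Z_out as a parallel R_p, L₁, C₁ tank; all component values, the 2-GHz resonance, and the two hypothetical R_in profiles are illustrative assumptions.

```python
import math

def tank_impedance(freq, r_p, l1, c1):
    """Impedance of a parallel R_p, L1, C1 tank at the given frequency."""
    w = 2 * math.pi * freq
    y = 1 / r_p + 1 / (1j * w * l1) + 1j * w * c1
    return 1 / y

def is_stable(freqs, r_p, l1, c1, r_in):
    """True if Re{Z_out} + R_in(f) > 0 at every swept frequency (assumed condition)."""
    return all(tank_impedance(f, r_p, l1, c1).real + r_in(f) > 0 for f in freqs)

freqs = [1e9 + k * 10e6 for k in range(201)]        # 1 GHz to 3 GHz
L1 = 1e-9
C1 = 1 / ((2 * math.pi * 2e9) ** 2 * L1)            # resonance at 2 GHz
R_P = 200.0

def r_in_narrowband(f):
    """Hypothetical R_in: negative only within 50 MHz of the tank resonance."""
    return -50.0 if abs(f - 2e9) < 50e6 else 5.0

stable_narrow = is_stable(freqs, R_P, L1, C1, r_in_narrowband)
stable_wide = is_stable(freqs, R_P, L1, C1, lambda f: -50.0)
```

In this example the cascade is stable only when the negative resistance is confined to the band where the tank presents a large real part; away from resonance Re{Z_out} collapses and a wideband negative R_in violates the condition.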

We deal with the transistor-level design of a 6-GHz cascode PA in Chapter 13. The efficiency of the circuit reaches 30% around compression but falls to 5% with enough back-off to satisfy IEEE 802.11a requirements.

12.5 Large-Signal Impedance Matching

In the development of PAs thus far, we have assumed that the output matching network simply transforms R_L to a lower value. This simplistic model of the output network is shown in Fig. 12.34(a), where M₁ operates as an ideal current source and L₁ resonates with C_DB₁, allowing the transistor’s RF current to flow into R_L. In practice, however, the situation is more complex: the transistor exhibits an output resistance, r_O₁, and both r_O₁ and C_DB₁ vary significantly with V_DS₁ [Fig. 12.34(b)]. (Recall that for a high efficiency, V_DS₁ goes from near zero to 2V_DD and I_D₁ from near zero to a large value, creating considerable change in r_O₁ and C_DB₁.) Thus, a nonlinear complex output impedance must be matched to a linear load.

Figure 12.34 CS stage with (a) linear drain capacitance and (b) nonlinear drain capacitance and resistance.

Figure 12.35 Impedance matching with (a) simple transistor model, (b) C_DB included, (c) an LC network.

Before dealing with the task of nonlinear impedance matching, let us first consider a simple case where the transistor is modeled as an ideal current source having a linear resistive output impedance [Fig. 12.35(a)]. For a given r_O₁, how do we choose R_L? Let us compute the power delivered by M₁ to R_L, P_RL, and that consumed by the transistor’s output resistance, P_ro₁. We have

(12.60)

where I_p denotes the peak amplitude of the transistor’s RF current. Similarly,

(12.61)

For maximum power transfer, R_L is chosen equal to r_O₁, yielding P_RL = P_ro₁. That is, the transistor consumes half of the power, dropping the efficiency by a factor of two. On the other hand, since

(12.62)

we recognize that reducing R_L minimizes the relative power consumed by the transistor, allowing the efficiency to approach its theoretical maximum (e.g., 50% in class A stages). The key point here is that maximum power transfer does not correspond to maximum efficiency.⁵ In PA design, therefore, R_L is transformed to a value much less than r_O₁.⁶
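The distinction between maximum power transfer and maximum efficiency can be illustrated numerically. The sketch below treats the transistor as an ideal sinusoidal current source of peak amplitude I_p in parallel with r_O₁ and R_L, as in the simple model above; the numerical values are hypothetical.

```python
def power_split(i_p, r_o, r_l):
    """Average power in the load and in r_O for a current source of peak i_p."""
    i_load = i_p * r_o / (r_o + r_l)        # current divider
    i_ro = i_p * r_l / (r_o + r_l)
    p_load = 0.5 * i_load ** 2 * r_l
    p_ro = 0.5 * i_ro ** 2 * r_o
    return p_load, p_ro

p_l_matched, p_ro_matched = power_split(i_p=0.2, r_o=100.0, r_l=100.0)
p_l_low, p_ro_low = power_split(i_p=0.2, r_o=100.0, r_l=10.0)
ratio_low = p_ro_low / p_l_low              # equals R_L / r_O
```

With R_L = r_O₁ the two powers are equal, whereas with R_L = r_O₁/10 the transistor dissipates only one tenth of the load power.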

In the next step, suppose, as shown in Fig. 12.35(b), the transistor output capacitance is also included. Note that M₁ may be several millimeters wide for an output power level of, say, 100 mW, exhibiting large capacitances. The matching network must now provide a reactive component to cancel the effect of C_DB₁. Figure 12.35(c) illustrates a simple example where L₁ cancels C_DB₁, and C₁ and L₂ transform R_L to a lower value.

Now consider the general case of a nonlinear complex output impedance. A small-signal approximation of the impedance in the midrange of the output voltage and current can be used to obtain rough values for the matching network components, but modifying these values for maximum large-signal efficiency requires a great deal of trial and error, especially if the package parasitics must be taken into account. In practice, a more systematic approach called the “load-pull measurement” is employed.

Load-Pull Measurement

Let us envision how the matching network interposed between the output transistor and the load must be designed. As conceptually shown in Fig. 12.36(a), a lossless variable passive network (a “tuner”) can present to M₁ a complex load impedance, Z₁, whose imaginary and real parts are controlled externally. We vary Z₁ such that the power delivered to R_L remains constant and equal to P₁, thus obtaining the contour depicted in Fig. 12.36(b). A low P₁ corresponds to a broader range of Re{Z₁} and Im{Z₁} and hence a wider contour. Next, we seek those values of Z₁ that yield a higher output power, P₂, arriving at another (perhaps tighter) contour. These “load-pull” measurements can be repeated for increasing power levels, eventually arriving at an optimum impedance, Z_opt, for the maximum output power. Note that the power contours also indicate the sensitivity of P_out to errors in the choice of Z₁.

Figure 12.36 (a) Load-pull test, (b) contours used in load-pull test, (c) computation of input and output matching impedances.

In the above arrangement, the input impedance of the transistor, Z_in, has some dependence on Z₁ due to the gate-drain capacitance of M₁. Thus, the power delivered to the transistor varies with Z₁, leading to a variable power gain. This effect can be avoided by inserting another tuner between the signal generator and the gate and adjusting it to obtain conjugate matching at the input for each value of Z₁ [Fig. 12.36(c)]. In a multistage PA, however, this adjustment may be unnecessary: after Z₁ reaches the optimum, Z_in assumes a certain value, and the preceding stage is simply designed to drive Z_in.

The load-pull technique has been widely used in PA design, but it requires an automated setup with precise and stable tuners. This method has three drawbacks. First, the measured results for one device size cannot be directly applied to a different size. Second, the contours and impedance levels are measured at a single frequency, failing to predict the behavior (e.g., stability) at other frequencies. Third, since the optimum choice of Z₁ in Fig. 12.36(a) does not necessarily provide peaking at higher harmonics, this technique cannot predict the efficiency and output power in the presence of harmonic termination. For these reasons, high-performance PA design using load-pull data still entails some trial and error.

12.6 Basic Linearization Techniques

Recall from Section 12.3 that PAs designed for a high efficiency suffer from considerable nonlinearity. For relatively low output power levels, e.g., less than +10 dBm (10 mW), we may simply back off from the PA’s 1-dB compression point until the linearity reaches an acceptable value. The efficiency then falls significantly (e.g., to 10% for OFDM with 16QAM), but the absolute power drawn from the supply may still be reasonable (e.g., 100 mW). For higher output power levels, however, a low efficiency translates to a very large power consumption.
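The arithmetic behind these numbers is sketched below: the supply power equals the output power divided by the efficiency. The +30-dBm case is added only as an illustrative contrast and assumes the same 10% efficiency.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

def supply_power_mw(p_out_dbm, efficiency):
    """Power drawn from the supply for a given output power and efficiency."""
    return dbm_to_mw(p_out_dbm) / efficiency

low_power_case = supply_power_mw(10, 0.10)    # +10 dBm at 10% efficiency
high_power_case = supply_power_mw(30, 0.10)   # +30 dBm at the same efficiency
```

At +10 dBm the supply delivers about 100 mW, but at +30 dBm the same efficiency would demand about 10 W.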

A great deal of effort has been expended on linearization techniques that offer a higher overall efficiency than back-off from the compression point does. As we will see, such techniques can be categorized under two groups: those that require some linearity in the PA core, and those that, in principle, can operate with arbitrarily nonlinear stages. We expect the latter to achieve a higher efficiency.

Another point observed in the following study is that linear PAs are rarely realized as negative-feedback amplifiers. This is out of concern for stability, especially if the package parasitics and their variability must be taken into account.

In this section, we present four techniques: feedforward, Cartesian feedback, predistortion, and envelope feedback. Two other techniques, namely, polar modulation and outphasing, have become popular enough in modern RF design that they merit their own sections and will be studied in Sections 12.7 and 12.8, respectively.

12.6.1 Feedforward

A nonlinear PA generates an output voltage waveform that can be viewed as the sum of a linear replica of the desired signal and an “error” signal. The “feedforward” architecture computes this error and, with proper scaling, subtracts it from the output waveform [12–14]. Shown in Fig. 12.37(a) is a simple example, where the output of the main PA, V_M, is scaled by a factor of 1/A_v, generating V_N. The input is subtracted from V_N and the result is scaled by A_v and subtracted from V_M. If V_M = A_vV_in + V_D, where V_D represents the distortion content, then

(12.63)

yielding V_P = V_D/A_v, V_Q = V_D, and hence V_out = A_vV_in.

Figure 12.37 Feedforward linearization.

In practice the two amplifiers in Fig. 12.37(a) exhibit substantial phase shift at high frequencies, causing imperfect cancellation of V_D. Thus, as shown in Fig. 12.37(b), a delay stage, Δ₁, is inserted to compensate for the phase shift of the main PA, and another, Δ₂, for the phase shift of the error amplifier. The two paths leading from V_in to the first subtractor are sometimes called the “signal cancellation loop” and the two from M and P to the second subtractor, the “error cancellation loop.”

Avoiding feedback, the feedforward topology is inherently stable if the two constituent amplifiers remain stable, the principal advantage of this architecture. Nonetheless, feedforward suffers from several shortcomings that have made its use in integrated PA design difficult. First, the analog delay elements introduce loss if they are passive or distortion if they are active, a particularly serious issue for Δ₂ as it carries a full-swing signal. Second, the loss of the output subtractor (e.g., a transformer) degrades the efficiency. For example, a loss of 1 dB scales the delivered power by 10^(−0.1) ≈ 0.79 and hence lowers the efficiency by about 21%.
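The impact of output loss on efficiency follows directly from the definition of the decibel. The sketch below assumes that the supply power is unchanged and that the loss simply scales the power reaching the load; the function name is illustrative.

```python
def efficiency_after_loss(eta, loss_db):
    """Efficiency after a passive output network with the given loss in dB."""
    return eta * 10 ** (-loss_db / 10)

relative_drop = 1 - efficiency_after_loss(1.0, 1.0)    # fractional reduction for 1 dB
class_a_example = efficiency_after_loss(0.50, 1.0)     # ideal class A stage followed by 1 dB of loss
```

A 1-dB loss thus removes roughly one fifth of the efficiency, bringing an ideal 50% stage down to about 40%.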

A student surmises that the output subtraction need not introduce loss if it is performed in the current domain, e.g., as shown in Fig. 12.38. Explain the feasibility of this idea.

Figure 12.38 Addition of signals in current domain.

Solution:

Since the main PA in Fig. 12.37(b) is followed by a delay line and since performing delay in the current domain is difficult, the subtraction must inevitably occur in the voltage domain—and by means of passive devices. Thus, the idea is not practical. Other issues related to this concept are discussed later.

Third, the linearity improvement depends on the gain and phase matching of the signals sensed by each subtractor. The linearity can be measured by a two-tone test. It can be shown [12] that if the two paths from V_in in Fig. 12.37(b) to the inputs of the first subtractor exhibit a phase mismatch of Δφ and a relative gain mismatch of ΔA/A, then the suppression of the magnitude of the intermodulation products in V_out is given by

(12.64)

For example, if ΔA/A = 5% and Δφ = 5°, then E = 0.102, i.e., feedforward lowers the IM products by approximately 20 dB. The phase and gain mismatches in the error correction loop further degrade the performance.
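The numerical example can be reproduced with the sketch below. It assumes that the suppression takes the vector-error form E = sqrt[1 − 2(1 + ΔA/A) cos Δφ + (1 + ΔA/A)²], i.e., the magnitude of the difference between a unit phasor and a mismatched one; this form yields the 0.102 figure quoted above, but the reader should consult [12] for the derivation.

```python
import math

def im_suppression(gain_mismatch, phase_mismatch_deg):
    """Residual IM magnitude for a relative gain mismatch and a phase mismatch in degrees."""
    g = 1 + gain_mismatch
    phi = math.radians(phase_mismatch_deg)
    return math.sqrt(1 - 2 * g * math.cos(phi) + g * g)

e = im_suppression(0.05, 5.0)
e_db = 20 * math.log10(e)                   # suppression in dB (negative)
perfect = im_suppression(0.0, 0.0)          # ideal matching leaves no residue
```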

Considering the system of Fig. 12.37(b) as a “core” PA, apply another level of feedforward to further improve the linearity.

Solution:

Figure 12.39 shows the “nested” feedforward architecture [15]. The core PA output is scaled by 1/A_v, and a delayed replica of the main input is subtracted from it. The error is scaled by A_v and summed with the delayed replica of the core PA output.

Figure 12.39 Nested feedforward systems.

While various calibration schemes can be conceived to deal with path mismatches, the loss of the output subtractor (and of Δ₂) is the principal drawback of this architecture.

Suppose the main PA stage in Fig. 12.37(a) is completely nonlinear, i.e., its output transistor operates as an ideal switch. Study the effect of feedforward on the PA.

Solution:

With the output transistor acting as an ideal switch, the PA removes the envelope of the signal, retaining only the phase modulation (Fig. 12.40).

Figure 12.40 Simplified feedforward system.

If V_in(t) = V_env(t) cos[ω₀t + φ(t)], then

(12.65)

where V₀ is constant. For such a nonlinear stage, it is difficult to define the voltage gain, A_v, because the output has little resemblance to the input. Nonetheless, let us proceed with feedforward correction: we divide V_M by A_v, obtaining

(12.66)-(12.67)

It follows that

(12.68)-(12.70)

The output can therefore faithfully track the input with a voltage gain of A_v. Interestingly, the final output is independent of V₀.
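This result can be confirmed with a sample-by-sample sketch of the ideal feedforward loop. The limiting PA is modeled as producing V₀ cos[ω₀t + φ(t)] regardless of the input envelope, and both scaling paths are assumed ideal and delay-free; the numerical values are arbitrary.

```python
import math

def feedforward_output(v_env, phase, v0, a_v, w0t):
    """One time sample of the feedforward system with a fully limiting main PA."""
    carrier = math.cos(w0t + phase)
    v_in = v_env * carrier
    v_m = v0 * carrier                      # main PA output: envelope removed
    v_n = v_m / a_v                         # scaled replica of the PA output
    v_p = v_n - v_in                        # output of the first subtractor
    v_q = a_v * v_p                         # error amplifier output
    return v_m - v_q, v_in                  # output of the second subtractor

out_a, vin_a = feedforward_output(v_env=0.3, phase=0.7, v0=1.0, a_v=10.0, w0t=1.2)
out_b, vin_b = feedforward_output(v_env=0.3, phase=0.7, v0=5.0, a_v=10.0, w0t=1.2)
```

Both calls return A_v V_in even though V₀ differs by a factor of five. Note, however, that the error amplifier must then supply nearly the entire output signal.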

12.6.2 Cartesian Feedback

As mentioned previously, stability issues make it difficult to apply high-frequency negative feedback around power amplifiers. However, if most of the loop gain necessary for linearization is obtained at low frequencies, the excess phase shift may be kept small and the system stable. In a transmitter, this is possible because the waveform processed by the PA in fact originates from upconverting a baseband signal. Thus, if the PA output is downconverted and compared with the baseband signal, an error term proportional to the nonlinearity of the transmitter chain can be created. Figure 12.41(a) depicts a simple example, where the TX consists of only one upconversion mixer and a PA. The loop attempts to make V_PA an accurate replica of V_in, but at a different carrier frequency. Since the total phase shift through the mixers and the PA at high frequencies is significant, the phase, θ, is added to one of the LO signals so as to ensure stability.

Figure 12.41 (a) PA with translational feedback loop, (b) Cartesian feedback.

Note that the approach of Fig. 12.41(a) corrects for the nonlinearity of the entire TX chain, namely, A₁, MX₁, and the PA. Of course, since MX₂ must be sufficiently linear, it is typically preceded by an attenuator.

Most modulation schemes require quadrature upconversion—and hence quadrature downconversion in the above scheme. Figure 12.41(b) shows the resulting topology. In this form, the technique is called “Cartesian feedback” because both I and Q components participate in the loop.

It is instructive to compare the feedforward and Cartesian feedback topologies. The latter avoids the output subtractor and is much less sensitive to path mismatches. However, Cartesian feedback requires some linearity in the PA: if a completely nonlinear PA removes the envelope, no amount of feedback can restore it.

Cartesian feedback faces a severe issue: the choice of the stabilizing LO phase shift [e.g., θ in Fig. 12.41(a)] is not straightforward because the loop phase shift varies with process and temperature. For example, while roaming toward or away from the base station, a cell phone adjusts the PA output level and, inevitably, the chip temperature, making it difficult to select a single value for θ.

12.6.3 Predistortion

If the PA nonlinear characteristics are known, it is possible to “predistort” the input waveform in such a manner that, after experiencing the PA nonlinearity, it resembles the ideal waveform. For example, for a PA static characteristic expressed as y = g(x), predistortion subjects the input to a characteristic given by y = g⁻¹(x) [Fig. 12.42(a)]. Specifically, if g(x) is compressive, predistortion must expand the signal amplitude.

Figure 12.42 (a) Basic predistortion concept, (b) realization in baseband.
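A minimal sketch of static predistortion is given below. The compressive characteristic g(x) = tanh(x) is a hypothetical PA model chosen because its inverse is available in closed form; it is not taken from any particular circuit.

```python
import math

def pa(x):
    """Hypothetical compressive PA characteristic, y = g(x)."""
    return math.tanh(x)

def predistorter(x):
    """Inverse characteristic, y = g^-1(x); valid only for |x| < 1."""
    return math.atanh(x)

inputs = [-0.9, -0.3, 0.0, 0.5, 0.9]
outputs = [pa(predistorter(x)) for x in inputs]     # cascade is linear
expanded = predistorter(0.9)                        # exceeds 0.9: the predistorter expands
```

The cascade reproduces the input within the range where the inverse exists. Beyond |x| = 1 the PA saturates and no inverse can be found, which illustrates why the PA cannot be arbitrarily nonlinear.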

Predistortion suffers from two drawbacks. First, the performance degrades if the PA nonlinearity varies with process, temperature, and load impedance while the predistorter does not track these changes. For example, if the PA becomes more compressive, then the predistorter must become more expansive, a difficult task. Second, the PA cannot be arbitrarily nonlinear as no amount of predistortion can correct for an abrupt nonlinearity. Also, variations in the antenna impedance (e.g., how a user holds a cell phone) somewhat affect the PA nonlinearity, but predistortion provides a fixed correction.

Predistortion can also be realized in the digital domain to allow a more accurate cancellation. Illustrated in Fig. 12.42(b), the idea is to alter the baseband signal (e.g., expand its amplitude) such that it returns to its ideal waveform upon experiencing the TX chain nonlinearity. Of course, the above two issues still persist here.

A student surmises that the performance of the topology shown in Fig. 12.42(a) can be improved if the predistorter is continuously informed of the PA nonlinearity, i.e., if the PA output is fed back to the predistorter. Explain the pros and cons of this idea.

Solution:

Feedback around these topologies in fact leads to architectures resembling those shown in Fig. 12.41. Depicted in Fig. 12.43 is an example, where the feedback signal produced by the low-frequency ADCs “adjusts” the predistortion.

Figure 12.43 Predistortion with feedback.

12.6.4 Envelope Feedback

In order to reduce envelope nonlinearity (i.e., AM/AM conversion) of PAs, it is possible to apply negative feedback only to the envelope of the signal. Illustrated in Fig. 12.44, the idea is to attenuate the output by a factor of α, detect the envelope of the result, compare it with the input envelope, and adjust the gain of the signal path accordingly. With a high loop gain, the signals at A and B are nearly identical, thus forcing V_out to track V_in with a gain factor of 1/α.

Figure 12.44 PA with envelope feedback.
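The action of the loop can be sketched with a quasi-static model in which the envelope changes slowly compared with the loop settling time. The compressive PA characteristic, the use of an integrator to provide a high loop gain, and all numerical values below are assumptions made for illustration only.

```python
import math

def pa_envelope(x):
    """Hypothetical compressive PA: small-signal gain of 10, saturating at 10 V."""
    return 10.0 * math.tanh(x)

def envelope_loop(env_in, alpha, mu=0.1, iterations=2000):
    """Integrate the envelope error until the detected levels at A and B agree."""
    gain = 0.0
    for _ in range(iterations):
        v_a = env_in                                   # detected input envelope
        v_b = alpha * pa_envelope(gain * env_in)       # detected, attenuated output envelope
        gain += mu * (v_a - v_b)                       # integrator adjusts the signal-path gain
    return pa_envelope(gain * env_in)

alpha = 0.2
test_envelopes = (0.5, 1.0, 1.5)
outputs = [envelope_loop(e, alpha) for e in test_envelopes]
```

For each input level the output envelope settles to the input envelope divided by α, i.e., a gain of 5, even though the open-loop PA gain falls as the drive increases.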

How does the distortion of the envelope detectors affect the performance of the above system?

Solution:

If the two detectors remain identical, their distortion does not affect the performance because the feedback loop still yields V_A ≈ V_B and hence V_D ≈ V_in. This property proves greatly helpful here as typical envelope detectors suffer from nonlinearity.

Envelope Detection

The reader may wonder how an envelope detector can be designed. As shown in Fig. 12.45(a), a mixer can raise the input to the power of two, yielding from V_in(t) = V_env(t) cos[ω₀t + φ(t)] the following output

(12.71)-(12.72)

where β denotes the mixer conversion gain. Thus, the low-frequency term at the output is proportional to V_env²(t). Since the nonlinearity of the envelope detector in the above scheme is not critical, this topology appears a plausible choice.

Figure 12.45 (a) Mixer as envelope detector, (b) source follower as envelope detector, (c) limiter and mixer as envelope detector, (d) realization of (c).

Figure 12.45(b) shows an envelope detector circuit based on “peak detection.” Here, the slew rate given by I₁/C₁ is chosen much less than the carrier slew rate so that the output tracks the envelope but not the carrier. As V_in rises above V_out + V_TH, V_out tends to track it, but as V_in falls, M₁ turns off and V_out remains relatively constant because I₁ discharges C₁ very slowly. The dimensions of M₁ and the values of I₁ and C₁ must be chosen carefully here: if M₁ is not strong enough or C₁ is excessively large, then V_out fails to track the envelope itself.

A true envelope detector can be realized if the topology of Fig. 12.45(a) is modified as shown in Fig. 12.45(c). Called a “synchronous AM detector,” the circuit employs a limiter in either of the signal paths, thus removing the envelope variation in that path. Denoting the signal at B by V₀ cos[ω₀t + φ(t)], we have

(12.73)-(12.74)

The low-pass filter therefore produces the true envelope. Figure 12.45(d) depicts the transistor-level implementation. Here, the limiter transistors must have a small overdrive voltage so that they remove the amplitude variation. In practice, the limiter may require two or more cascaded differential pairs so as to remove envelope variations in one path leading to the mixer.
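The difference between the squaring mixer of Fig. 12.45(a) and the synchronous detector of Fig. 12.45(c) can be checked numerically. The following is a minimal sketch, assuming an ideal multiplier with gain `beta`, a limiter output of fixed amplitude `v0`, and an ideal low-pass filter modeled as an average over one carrier cycle; the function name and default values are hypothetical.

```python
import math

def lowpass_output(envelope, detector, beta=1.0, v0=1.0, samples=1000):
    """Average the mixer output over one carrier cycle (ideal low-pass filter)."""
    total = 0.0
    for k in range(samples):
        carrier = math.cos(2.0 * math.pi * k / samples)
        signal = envelope * carrier
        if detector == "squarer":
            # Both mixer ports see the full signal, envelope included.
            product = beta * signal * signal
        elif detector == "synchronous":
            # One port sees the limited, constant-amplitude carrier (assumed ideal).
            product = beta * signal * (v0 * carrier)
        else:
            raise ValueError("unknown detector type")
        total += product
    return total / samples

for v_env in (0.5, 1.0, 2.0):
    print(v_env,
          lowpass_output(v_env, "squarer"),
          lowpass_output(v_env, "synchronous"))
```

Doubling the envelope quadruples the squarer output but only doubles the synchronous-detector output, which is why the latter is called a true envelope detector.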

12.7 Polar Modulation

A linearization technique originally called “envelope elimination and restoration” (EER) [16] and more recently known as “polar modulation” [17] has become popular in the past ten years. This technique offers two key advantages that allow a high efficiency: (1) it can operate with an arbitrarily nonlinear output stage,⁷ and (2) it does not require an output combiner (e.g., the subtractor in the feedforward topology).

12.7.1 Basic Idea

Let us begin with the original EER method. As mentioned in Chapter 3, any band-pass signal can be represented as V_in(t) = V_env(t) cos[ω₀t + φ(t)], where V_env(t) and φ(t) denote the envelope and phase components, respectively. We may then postulate that we can decompose V_in(t) into an envelope signal and a phase signal, amplify each separately, and combine the results at the end. Figure 12.46 illustrates the concept. The input signal drives both an envelope detector and a limiting stage, thus generating the envelope, V_env(t), and the phase-modulated component, V_phase(t) = V₀ cos[ω₀t + φ(t)]. Note that the latter still contains the carrier—rather than only φ(t)—even though it is called the “phase” signal. These signals are subsequently amplified and “combined” in the PA, reproducing the desired waveform. Since the output stage amplifies a constant-envelope signal, V_phase(t), it can be nonlinear and hence efficient. This approach is also called polar modulation because it processes the signal in the form of a magnitude (envelope) component and a phase component.

Figure 12.46 Envelope elimination and restoration.

How should the amplified versions of V_env(t) and V_phase(t) be combined in the output stage? Denoting those versions by A₀V_env(t) and A₀V_phase(t), respectively, we observe that the desired output assumes the form A₀V_env(t) cos[ω₀t + φ(t)], i.e., the amplitude of A₀V_phase(t) must be modulated by A₀V_env(t). It follows that the combining operation must entail multiplication or mixing rather than linear addition.

A student decides that a simple mixer serves the purpose of combining and constructs the system shown in Fig. 12.47. Is this a good idea?

Figure 12.47 Use of mixer to combine envelope and phase signals.

Solution:

No, it is not. Here, it is the mixer—rather than the PA core—that must deliver a high power, a very difficult task.

The combining operation is typically performed by applying the envelope signal to the supply voltage, V_DD, of the output stage, with the assumption that the output voltage swing is a function of V_DD. To understand this point, let us begin with the simple circuit depicted in Fig. 12.48(a), where S₁ is driven by the phase signal. When S₁ turns on, V_out jumps to near zero and subsequently rises exponentially toward V_DD [Fig. 12.48(b)]. When S₁ turns off, the instantaneous change in the inductor current yields an impulse in the output voltage. The output voltage swing is clearly a function of V_DD. Note that the area between V_DD and the exponential section must equal the area of the impulse above V_DD so that the output average remains equal to V_DD.

Figure 12.48 (a) Simple model of output stage, (b) output waveform, (c) stage with capacitances and load resistance, (d) resulting output waveform.

Now consider the more realistic circuit shown in Fig. 12.48(c). In this case, the output waveform somewhat resembles a sinusoid [Fig. 12.48(d)], but its amplitude is still a function of V_DD.

Under what condition is the PA output swing not a function of V_DD?

Solution:

If the output transistor acts as a voltage-dependent current source (e.g., a MOSFET operating in saturation), then the output swing is only a weak function of V_DD. In other words, all PA classes that employ the output transistor as a current source fall in this category and are not suited to EER.

The foregoing observations lead to the conceptual combining circuit shown in Fig. 12.49(a), where the envelope signal directly drives the supply node of the PA stage. The large current flowing through this stage requires a buffer in this path, but efficiency considerations demand minimal voltage headroom consumption by the buffer. As an example, the arrangement in Fig. 12.49(b) incorporates a voltage-dependent resistor, M₂, to modulate V_DD,PA, in proportion to A₀V_env(t). For an average current of I₀ through L₁ and an average voltage drop of V₀ across the drain-source resistance of M₂, this device dissipates a power of I₀V₀, lowering the efficiency. Thus, M₂ is typically a very wide transistor.

Figure 12.49 (a) Partial realization of EER, (b) output stage with envelope-controlled load, (c) local envelope feedback.

Does the circuit of Fig. 12.49(b) guarantee that V_DD,PA tracks A₀V_env(t) faithfully? No, it does not: in this “open-loop” control, V_DD,PA is a function of various device parameters. This issue becomes more serious if the PA must provide a variable output level because changing the current of the output stage also alters V_DD,PA. We may modify the stage to the “closed-loop” control shown in Fig. 12.49(c), where amplifier A₁ introduces a high loop gain so that V_DD,PA ≈ A₀V_env(t). Of course, A₁ must accommodate an input common-mode level near V_DD.

12.7.2 Polar Modulation Issues

Polar modulation entails a number of issues. First, the mismatch between the delays of the envelope and phase paths corrupts the signal in Fig. 12.46. To formulate this effect, we assume a delay mismatch of ΔT and express the output as

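V_out(t) = A₀V_env(t − ΔT) cos[ω₀t + φ(t)].    (12.75)

For a small ΔT, V_env(t − ΔT) can be approximated by the first two terms in its Taylor series:

V_env(t − ΔT) ≈ V_env(t) − ΔT (dV_env(t)/dt).    (12.76)

It follows that

V_out(t) ≈ A₀V_env(t) cos[ω₀t + φ(t)] − A₀ΔT (dV_env(t)/dt) cos[ω₀t + φ(t)].    (12.77)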

The corruption is therefore proportional to the derivative of the envelope signal, leading to substantial spectral regrowth because the spectrum of V_env(t) is equivalently multiplied by ω². For example, in an EDGE system, a delay mismatch of 40 ns allows only 5 dB of margin between the output spectrum and the required spectral mask [18].
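The first-order approximation behind this result can be checked with a short numerical sketch. It assumes a hypothetical single-tone envelope, 1 + 0.5 cos(2πf_m t), with f_m = 100 kHz; these values are illustrative only and are not taken from any standard.

```python
import math

def envelope(t, f_m=100e3):
    """Hypothetical test envelope: a dc level plus one modulating tone."""
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * f_m * t)

def envelope_slope(t, f_m=100e3):
    """Analytical time derivative of the test envelope."""
    return -0.5 * 2.0 * math.pi * f_m * math.sin(2.0 * math.pi * f_m * t)

def envelope_error(t, delta_t, f_m=100e3):
    """Return (exact error, first-order Taylor error) caused by a delay delta_t."""
    exact = envelope(t - delta_t, f_m) - envelope(t, f_m)
    first_order = -delta_t * envelope_slope(t, f_m)
    return exact, first_order

# Evaluate where the envelope slope is steepest (a quarter of the tone period).
t_peak = 1.0 / (4.0 * 100e3)
exact, approx = envelope_error(t_peak, 40e-9)
print(exact, approx)
```

For a delay of 40 ns the two values agree closely, and the error scales with both ΔT and the modulation frequency, in line with the derivative term in the approximation.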

The problem of delay mismatch is a serious one because the two paths in Fig. 12.46 employ different types of circuits operating at vastly different frequencies: the envelope path contains an envelope detector and a low-frequency buffer, whereas the phase path includes a limiter and an output stage.

The second issue relates to the linearity of the envelope detector. Unlike the feedback topology of Fig. 12.44, the polar TX in Fig. 12.46 relies on precise reconstruction of V_env(t) by the envelope detector. As shown in Problem 12.6, this circuit’s nonlinearity produces spectral regrowth.

The third issue concerns the operation of limiters at high frequencies. In general, a nonlinear circuit having a finite bandwidth introduces AM/PM conversion, i.e., exhibits a phase shift that depends on the input amplitude. For example, consider the differential pair shown in Fig. 12.50(a), where the bandwidth is defined by the output pole, ω_p = 1/(R₁C₁). If the input is a small sinusoidal signal at ω₀, then the differential output current is also a sinusoid, experiencing a phase shift of

(12.78)

as it is converted to voltage. For ω₀ ≪ ω_p,

(12.79)

Figure 12.50 Limiting stage with (a) small and (b) large input swings.

Now, if the circuit senses a large input sinusoid [Fig. 12.50(b)] such that M₁ and M₂ produce nearly rectangular drain current waveforms, then the delay between the input and output is approximately equal to⁸

(12.80)

Expressing this result in radians, we have

(12.81)

Comparison of Eqs. (12.79) and (12.81) reveals that the phase shift decreases as the input amplitude increases. Thus, the large excursions in the envelope may corrupt the phase signal as it passes through the limiter in Fig. 12.46.
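The amplitude dependence of the limiter phase shift can be illustrated with a small sketch. Two assumptions are made here that are not spelled out above: the small-signal phase shift is taken as that of a single pole, tan⁻¹(R₁C₁ω₀), and the large-signal delay is taken as R₁C₁ ln 2, i.e., the time a first-order response driven by a rectangular current needs to reach the midpoint of its swing. The component values are hypothetical.

```python
import math

def small_signal_phase(w0, r1, c1):
    """Phase lag (rad, negative) of a one-pole stage for a small sinusoid."""
    return -math.atan(r1 * c1 * w0)

def large_signal_phase(w0, r1, c1):
    """Phase lag (rad, negative) assuming a midpoint-crossing delay of R1*C1*ln(2)."""
    delay = r1 * c1 * math.log(2.0)
    return -w0 * delay

w0 = 2.0 * math.pi * 2e9      # hypothetical 2-GHz carrier
r1, c1 = 100.0, 100e-15       # hypothetical load: 100 ohms, 100 fF
print(small_signal_phase(w0, r1, c1), large_signal_phase(w0, r1, c1))
```

Under these assumptions the large-signal lag is smaller in magnitude than the small-signal lag, so an envelope that sweeps the limiter between the two regimes modulates the phase.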

The fourth issue stems from the variation of the output node capacitance (C_DB) in Fig. 12.49(c) by the envelope signal. As V_DD,PA swings up and down to track A₀V_env(t), C_DB varies and so does the phase shift from the gate of M₁ to its drain, φ₀ (Fig. 12.51). That is, the phase signal is corrupted by the envelope signal. This effect can be quantified as follows. We recognize that the variation of C_DB alters the resonance frequency, ω₁, at the output node. We can therefore express the dependence of φ₀ upon the drain voltage as a straight line having a slope of⁹

(12.82)

Figure 12.51 AM/PM conversion due to output capacitance nonlinearity.

The first derivative on the right-hand side can readily be found, e.g., from

(12.83)

where V_B denotes the junction built-in potential and m is typically around 0.4. The second derivative, dω/dC_DB, is obtained from the expression for the resonance frequency, ω₁, at the output node as

(12.84)-(12.85)

Finally, dφ₀/dω is computed from the quality factor, Q, of the output network (Chapter 8); that is,

(12.86)

and hence

(12.87)

It follows that

(12.88)

To the first order,

(12.89)

As mentioned earlier, another issue in polar modulation is the efficiency (and voltage headroom) reduction due to the envelope buffer [M₂ in Fig. 12.49(c)]. We will see below that, among the issues outlined above, only the last one defies design techniques and becomes the bottleneck at low supply voltages.

12.7.3 Improved Polar Modulation

The advent of RF IC technology has also improved polar transmitters considerably. In this section, we study a number of techniques that address the issues described in the previous section. The key principle here is to expand the design horizon to include the entire transmitter chain rather than merely the RF power amplifier.

In the conceptual approach depicted in Fig. 12.46, we attempted to decompose the RF signal into envelope and phase components, thus facing the limiter’s AM/PM conversion. Let us instead perform this decomposition in the baseband. For an RF waveform V_env(t) cos[ω₀t + φ(t)], the quadrature baseband signals are given by

(12.90)

(12.91)

Thus,

(12.92)-(12.93)

In other words, the digital baseband processor can generate V_env(t) and φ(t) either directly or from the I and Q components, obviating the need for decomposition in the RF domain.
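The baseband conversion between quadrature and polar forms can be sketched as follows. The sketch assumes the convention I = V_env cos φ and Q = V_env sin φ; the function names are hypothetical, and a real baseband processor would perform this on sampled data, possibly with a CORDIC block rather than floating-point functions.

```python
import math

def to_polar(i_bb, q_bb):
    """Return (envelope, phase) for one pair of quadrature baseband samples."""
    return math.hypot(i_bb, q_bb), math.atan2(q_bb, i_bb)

def to_quadrature(v_env, phi):
    """Return (I, Q) under the assumed convention I = V_env*cos(phi), Q = V_env*sin(phi)."""
    return v_env * math.cos(phi), v_env * math.sin(phi)

i_bb, q_bb = to_quadrature(0.8, 2.0)
print(to_polar(i_bb, q_bb))
```

Using `atan2` rather than a plain arctangent of Q/I keeps the phase in the correct quadrant when I is negative or zero.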

While V_env(t) can now be applied to modulate the PA power supply, φ(t) does not easily lend itself to upconversion to radio frequencies. The following example illustrates this point.

In our study of frequency-modulated or phase-modulated transmitters in Chapter 3, we encountered two architectures, namely, direct VCO modulation and quadrature upconversion. Can these architectures be utilized in a polar modulation system?

Solution:

First, consider applying the phase information to the control line of a VCO. The integration performed by the VCO requires that φ(t) be first differentiated [Fig. 12.52(a)]. We have

(12.94)-(12.95)

Figure 12.52 Polar modulation using baseband signal separation and (a) a VCO, or (b) a quadrature upconverter.

However, as explained in Chapter 3, since both the full-scale swing of dφ/dt (in the analog domain) and K_VCO are poorly-defined, so is the bandwidth of V_phase(t). Also, the free-running operation of the VCO during modulation may shift the carrier frequency from its desired value.

Now, consider a quadrature modulator, as stipulated in Chapter 3 for GMSK. In this case, V_phase(t) is expressed as

(12.96)

so that V₀ cos φ and V₀ sin φ are produced by the baseband and upconverted by quadrature mixers [Fig. 12.52(b)]. However, as mentioned in Chapter 4, this approach may still introduce significant noise in the receive band because the noise of the mixers is upconverted and amplified by the PA.

In addition to direct VCO modulation and quadrature upconversion, we studied in Chapter 9 a number of techniques leading to the offset-PLL TX. For example, we contemplated a PLL as a means of upconversion of the phase signal. Figure 12.53(a) depicts an architecture combining that idea with polar modulation. In this case, the phase signal produced by the baseband processor is located at a finite carrier frequency, ω_IF, and its phase excursion is scaled down by a factor of N. The PLL thus generates an output given by

(12.97)

where Nω_IF is chosen equal to the desired carrier frequency. The value of ω_IF must remain between two bounds: (1) it must be low enough to avoid imposing severe speed-power trade-offs on the baseband DAC, and (2) it must be high enough to avoid aliasing [Fig. 12.53(b)].

Figure 12.53 (a) Polar modulation using a PLL in the phase path, (b) spectrum of phase signal.

It is possible to combine an offset-PLL TX with polar modulation [19]. Illustrated in Fig. 12.54, the idea is to perform quadrature upconversion to a certain IF, extract the envelope component, and apply it to the PA. The VCO output is downconverted, serving as the LO waveform for the quadrature modulator. Note that the IF signal at node A carries little phase modulation because the PLL feedback forces the phase at A to track that of f_REF (an unmodulated reference). With proper choice of the PLL bandwidth, the output noise in the receive band is determined primarily by the VCO design.

Figure 12.54 Polar modulation with phase feedback.

How can the architecture of Fig. 12.54 be modified so as to avoid an envelope detector?

Solution:

If the quadrature upconverter senses only the baseband phase information [as in Fig. 12.52(b)], then the envelope can also come from the baseband. Figure 12.55 shows such an arrangement, where the envelope component is directly produced by the baseband processor.

Figure 12.55 Polar modulation without envelope detection.

The polar modulation architectures studied above still fail to address two issues, namely, poor definition of the PA output envelope and the corruption due to the PA’s AM/PM conversion (e.g., due to the output capacitance nonlinearity). We must therefore apply feedback to sense and correct these effects. As shown in Fig. 12.49(c), the envelope can be controlled precisely by means of a feedback buffer driving the supply rail of the PA. Alternatively, as in the envelope feedback architecture of Fig. 12.44, the output envelope can be compared with the input envelope. Figure 12.56 depicts the resulting arrangement. The PA output voltage swing is scaled by a factor of α, applied to an envelope detector, and compared with the IF envelope. The feedback loop thus forces a faithful (scaled) replica of the IF envelope at the PA output. The envelope detectors can be realized as shown in Figs. 12.45(c) and (d).

Figure 12.56 Polar modulation with envelope feedback.

In order to correct the PA’s AM/PM conversion, the PA output phase must appear within the PLL, i.e., the PLL feedback path must sense the PA output rather than the VCO output. Illustrated in Fig. 12.57, such an architecture impresses the baseband phase excursions on the PA output by virtue of the high loop gain of the PLL. In other words, if the PA introduces AM/PM conversion, the PLL still guarantees that the phase at X tracks the baseband phase modulation. The two feedback loops present in this architecture can interact and cause instability, requiring careful choice of their bandwidths.

Figure 12.57 Polar modulation with phase and envelope feedback.

Identify the drawbacks of the architecture shown in Fig. 12.57.

Solution:

A critical issue here relates to the need for power control. Since the PA output level must be variable (by about 30 dB in GSM/EDGE and 60 dB in CDMA), the swing applied to mixer MX₁ may prove insufficient at the lower end of the power range, degrading the stability of the loop. For example, for a maximum peak-to-peak swing of 2 V at X and 30 dB of power range, the minimum swing sensed by MX₁ is about 63 mV_pp (2 V divided by 10^(30/20) ≈ 31.6). To resolve this issue, a limiter must be interposed between the PA and MX₁, but we recall from Fig. 12.50 that limiters introduce considerable AM/PM conversion if their input senses a wide range of amplitudes. Of course, the limiter’s AM/PM conversion is not corrected by the loop.

Another drawback of the architecture is that the independent envelope and phase loops may exhibit substantially different delays, exacerbating the delay mismatch effect formulated by Eq. (12.77). In other words, the delay through the envelope detector, the error amplifier, and the supply modulation device in Fig. 12.57 may be arbitrarily different from that through the limiter, with no correction provided by the two loops.
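The swing available to MX₁ at the bottom of the power range follows directly from the power-control range. The helper below is a minimal sketch; its name is hypothetical, and it assumes the swing at X scales with the square root of the output power (a fixed load impedance).

```python
def min_swing(v_max_pp, power_range_db):
    """Smallest peak-to-peak swing for a given maximum swing and power range in dB."""
    return v_max_pp / 10.0 ** (power_range_db / 20.0)

print(min_swing(2.0, 30.0))   # GSM/EDGE-like range
print(min_swing(2.0, 60.0))   # CDMA-like range
```

For a 2-V maximum swing, a 30-dB range leaves on the order of 60 mV_pp, while a 60-dB range leaves only about 2 mV_pp, which shows why the mixer input becomes problematic in CDMA.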

Other Issues

The architecture of Fig. 12.57 or its variants [19] resolves some of the polar modulation issues identified in Section 12.7.2. However, several other challenges remain that merit attention.

First, the bandwidths of the envelope and phase signal paths must be chosen carefully. The key point here is that each of these components occupies a larger bandwidth than the overall composite modulated signal. As an example, Fig. 12.58 plots the spectra of the individual components and the composite signal along with the spectral mask for an EDGE system [18]. We note that the envelope spectrum exceeds the mask in a few regions and, more importantly, the phase spectrum consumes a much broader bandwidth. If the envelope and phase paths do not provide sufficient bandwidth, then the two components are not combined properly and the final PA output suffers from spectral regrowth, possibly violating the spectral mask. For example, if in an EDGE system the AM and PM path bandwidths are equal to 1 MHz and 3 MHz, respectively, then the output spectrum bears only a 2-dB margin with respect to the mask [18].

Figure 12.58 GSM/EDGE mask margins for a polar modulation system.

While the foregoing considerations call for a large bandwidth in the two paths, we must recall that the PLL specifically serves to reduce the noise in the receive band and, therefore, cannot have a large bandwidth. The trade-off between spectral regrowth and noise in the RX band in turn dictates tight control over the PLL bandwidth. Since the dependence of the charge pump current and K_VCO upon process and temperature leads to significant bandwidth variation, some means of bandwidth calibration is often necessary [18].

The second issue relates to the leakage of the PM signal to the output as an additive component. For example, suppose, as shown in Fig. 12.59, the VCO inductor couples a fraction of the PM signal to an inductor (or a pad) at the output of the PA [18].

Figure 12.59 Phase signal leakage path.

Noting the broad bandwidth of the phase signal in Fig. 12.58, we recognize that this leakage produces considerable spectral regrowth if it does not experience proper envelope modulation [18]. This phenomenon can be readily formulated as

(12.98)

where the second term represents the additive leakage.

The third issue concerns dc offsets in the envelope path [18]. If the envelope produced by the envelope detector has an offset, V_OS, then the PA output is given by

(12.99)

That is, the output contains a PM leakage component equal to A₀V_OS cos[ω₀t + φ(t)], which must be minimized so as to avoid spectral regrowth. For example, in an EDGE system, V_OS must remain below 0.2% of the peak of V_env(t) to allow sufficient margin for other errors [18]. Of course, if the output power must be variable, such a condition must hold even for the lowest output level, a difficult task.
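The difficulty at low output levels can be made concrete with a short calculation. The sketch below assumes that the offset is a fixed absolute voltage while the envelope peak is scaled down by the power control; the function name and the normalized values are hypothetical.

```python
import math

def leakage_db(v_os, v_env_peak):
    """PM leakage amplitude relative to the envelope peak, in dB."""
    return 20.0 * math.log10(v_os / v_env_peak)

v_os = 0.002                      # offset equal to 0.2% of a 1-V full-scale peak
full_scale = 1.0
backed_off = full_scale / 10.0 ** (30.0 / 20.0)   # envelope peak 30 dB lower

print(leakage_db(v_os, full_scale))
print(leakage_db(v_os, backed_off))
```

An offset of 0.2% corresponds to a leakage of about −54 dB relative to the envelope peak at full scale; if the same absolute offset persists when the envelope is reduced by 30 dB, the relative leakage rises by the same 30 dB.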

12.8 Outphasing

12.8.1 Basic Idea

It is possible to avoid envelope variations in a PA by decomposing a variable-envelope signal into two constant-envelope waveforms. Called “outphasing” in [20] and “linear amplification with nonlinear components” (LINC) in [21], the idea is that a band-pass signal V_in(t) = V_env(t) cos[ω₀t + φ(t)] can be expressed as the sum of two phase-modulated components (Fig. 12.60),

(12.100)-(12.101)

where

(12.102)

(12.103)

and

(12.104)

Figure 12.60 Basic outphasing.

Thus, if V₁(t) and V₂(t) are generated from V_in(t), amplified by means of nonlinear stages, and subsequently added, the output contains the same envelope and phase information as does V_in(t).
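The decomposition can be verified numerically. The sketch below assumes one common form of the two components, V₁ = (V₀/2) sin(ω₀t + φ + θ) and V₂ = −(V₀/2) sin(ω₀t + φ − θ) with θ = sin⁻¹(V_env/V₀); this choice is an assumption made for illustration, and other equivalent sign conventions exist.

```python
import math

def outphase(v_env, phi, v0, carrier_phase):
    """Return (V1, V2) at one instant, where carrier_phase stands for w0*t.

    Assumed forms: V1 = (V0/2) sin(psi + theta), V2 = -(V0/2) sin(psi - theta),
    with psi = w0*t + phi and theta = asin(V_env / V0).
    """
    if not 0.0 <= v_env <= v0:
        raise ValueError("envelope must lie between 0 and V0")
    theta = math.asin(v_env / v0)
    psi = carrier_phase + phi
    v1 = 0.5 * v0 * math.sin(psi + theta)
    v2 = -0.5 * v0 * math.sin(psi - theta)
    return v1, v2

v1, v2 = outphase(v_env=0.6, phi=0.3, v0=1.0, carrier_phase=1.1)
print(v1 + v2, 0.6 * math.cos(1.1 + 0.3))
```

Each component has a constant amplitude of V₀/2, yet their sum reproduces V_env cos(ω₀t + φ) at every instant.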

Generation of V₁(t) and V₂(t) from V_in(t) requires substantial complexity, primarily because their phase must be modulated by θ(t), which itself is a nonlinear function of V_env(t). The use of nonlinear frequency-translating feedback loops has been proposed [21, 22], but loop stability issues limit the feasibility of these techniques. A more practical approach [23] considers V₁(t) and V₂(t) as

(12.105)

(12.106)

where the baseband components are given by

(12.107)

(12.108)

Since the nonlinear operation required to produce V_Q(t) can be performed in the baseband (e.g., using a look-up ROM), this method can simply employ quadrature upconversion to generate V₁(t) and V₂(t).

Construct a complete outphasing transmitter.

Solution:

From our study of GMSK modulation techniques in Chapter 3, we recall that the phase component, φ(t), should also be realized in the baseband rather than impressed on the LO. We therefore expand the original equations, (12.102) and (12.103), respectively, as follows

(12.109)

(12.110)

The TX is thus constructed as shown in Fig. 12.61.

Figure 12.61 Outphasing transmitter.

The outphasing architecture can operate with completely nonlinear PA stages, an important attribute similar to that of polar modulation. A critical advantage of outphasing is that it does not require supply modulation, saving the efficiency and headroom lost in the envelope buffer necessary in polar modulation. Unfortunately, the summation of the outputs in the outphasing technique entails power loss (as in the feedforward topology).

12.8.2 Outphasing Issues

In addition to the output summation problem, outphasing must deal with a number of other issues. First, the gain and phase mismatches between the two paths in Fig. 12.60 result in spectral regrowth at the output. Representing the two mismatches by ΔV and Δθ, respectively, we have

(12.111)

(12.112)

If Δθ ≪ 1 radian, then

(12.113)

The last two terms on the right-hand side create spectral regrowth because they exhibit a much larger bandwidth than the composite signal (the first term).

Identify the sources of mismatch in the architecture of Fig. 12.61.

Solution:

To avoid LO mismatch, the two quadrature upconverters must share the LO phases. The remaining sources include the mixers, the PAs, and the output summing mechanism.

The second issue concerns the required bandwidth of each path in Fig. 12.60. Since V₁(t) and V₂(t) experience large phase excursions, φ(t) ± θ(t) (when φ and θ “beat”), these two signals occupy a large bandwidth. Recall from the EDGE spectra in Fig. 12.58 that the bandwidth of a component of the form cos[ω₀t + φ(t)] is several times that of the composite signal. This is exacerbated in outphasing by the additional phase, θ(t).

A student attempts to reduce the excursions of θ(t) by selecting a scaling voltage of V_a > V₀ in Eq. (12.104):

(12.114)

Explain the effect on the overall TX. Assume the baseband waveforms are generated according to (12.109) and (12.110), i.e., with an amplitude of V₀/2.

Solution:

If θ(t) is scaled down while the amplitude of the baseband signals remains constant, the composite output amplitude falls. In Problem 12.9, we show that Eq. (12.113) must now be written as

(12.115)

It follows that the effect of mismatches becomes more pronounced as V_a increases and θ(t) is scaled down.

The third issue relates to the interaction between the two PAs through the output summing device. The signal traveling through one PA may affect that through the other, resulting in spectral regrowth and even corruption. To understand this point, let us consider the simple summation shown in Fig. 12.62(a). If M₁ and M₂ operate as ideal current sources, then one PA’s signal has little effect on the other’s.¹⁰ However, it is difficult to achieve a high efficiency while keeping M₁ and M₂ in saturation.

Figure 12.62 (a) Example of combining circuit, (b) simple model.

Now, suppose M₁ and M₂ enter the deep triode region and can be modeled as voltage-controlled switches [Fig. 12.62(b)]. In this case, the load seen by one PA is modulated by the other and hence varies with time, distorting the signal.

To formulate the interaction between the PAs, we consider the more common arrangement depicted in Fig. 12.63(a), where a transformer sums the outputs¹¹ and drives the load resistance. The output network can be simplified as shown in Fig. 12.63(b). We wish to determine the impedance seen by each PA with respect to ground. To this end, we must compute I_AB = (V_A − V_B)/R_L and then Z₁ = V_A/I_AB and Z₂ = −V_B/I_AB. If each PA stage is modeled as an ideal voltage buffer with a unity gain, then V_A = V₁ and V_B = V₂, yielding

(12.116)-(12.118)

Figure 12.63 (a) Outphasing with a transformer, (b) equivalent circuit.

It follows that

(12.119)-(12.120)

We now assume θ is relatively constant with time, and transform this result to the frequency domain. Since the numerator and denominator of the fraction in the second term are 90° out of phase, they introduce a factor of −j in the equivalent impedance. Thus,

(12.121)

i.e., the equivalent impedance seen by PA₁ consists of a real part equal to R_L/2 and an imaginary part equal to (− cot θ)R_L/2.¹² Similarly,

(12.122)

It is often said that the reactive parts in Eqs. (12.121) and (12.122) correspond to capacitance and inductance, respectively. Is this statement accurate?

Solution:

Generally, it is not. Capacitive and inductive reactances must be proportional to frequency, whereas the second terms in Eqs. (12.121) and (12.122) are not. However, for a narrowband signal, a negative reactance can be viewed as a capacitance and a positive reactance as an inductance.

The dependence of Z₁ and Z₂ upon θ reveals that, if the PAs are not ideal voltage buffers, then the signal experiences a time-varying voltage division [Fig. 12.64(a)] and hence distortion. Recognized by Chireix [20], this effect can be alleviated if an additional reactance with opposite polarity is tied to each PA’s output so as to cancel the second term in Eqs. (12.121) or (12.122) [Fig. 12.64(b)]. Since a parallel reactance (admittance) is usually preferred, we first transform Z₁ and Z₂ to admittances. Inverting the left-hand side of (12.121) and multiplying the numerator and denominator by 1 + j cot θ, we have

(12.123)

Figure 12.64 (a) Time-varying voltage division in outphasing, (b) Chireix’s cancellation technique.

To cancel the second term,

(12.124)

and hence

(12.125)

Similarly,

(12.126)

To cancel the second term in (12.122),

(12.127)

and hence

(12.128)

With perfect cancellation, Z₁ = Z₂ = R_L/(2 sin² θ). Interestingly, L_A and C_B resonate at the carrier frequency because

(12.129)
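The impedances seen by the two PAs and the Chireix compensating elements can be reproduced with a short phasor calculation. The sketch below assumes equal-amplitude phasors V_A = (V₀/2)e^{jθ} and V_B = (V₀/2)e^{−jθ} (normalized to V₀ = 1) and a constant θ; this assignment is chosen because it reproduces the real part R_L/2 and the ∓(cot θ)R_L/2 reactive parts stated above. Function names and element values are hypothetical.

```python
import cmath
import math

def outphasing_impedances(theta, r_load):
    """Return (Z1, Z2) seen by the two PAs for a constant outphasing angle theta."""
    v_a = 0.5 * cmath.exp(1j * theta)     # assumed phasor at node A
    v_b = 0.5 * cmath.exp(-1j * theta)    # assumed phasor at node B
    i_ab = (v_a - v_b) / r_load
    return v_a / i_ab, -v_b / i_ab

def chireix_elements(theta, r_load, w0):
    """Return (L_A, C_B) that cancel the susceptance sin(2*theta)/R_L at w0."""
    b = math.sin(2.0 * theta) / r_load
    return 1.0 / (w0 * b), b / w0

theta, r_load, w0 = math.pi / 3.0, 50.0, 2.0 * math.pi * 2e9
z1, z2 = outphasing_impedances(theta, r_load)
l_a, c_b = chireix_elements(theta, r_load, w0)

# Total admittance at each node after adding the parallel compensating element.
y_a = 1.0 / z1 + 1.0 / (1j * w0 * l_a)
y_b = 1.0 / z2 + 1j * w0 * c_b
print(z1, z2)
print(1.0 / y_a, 1.0 / y_b, r_load / (2.0 * math.sin(theta) ** 2))
```

With the compensation in place, both nodes see the purely real value R_L/(2 sin² θ), and the product L_A C_B equals 1/ω₀², consistent with the resonance noted above. The cancellation is exact only at the θ for which the elements were sized.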

The foregoing results are based on two assumptions: each PA can be approximated by a voltage source, and θ is relatively constant. The reader may view both suspiciously. After all, a heavily-switching PA stage exhibits an output impedance that swings between a small value (when the transistor is in the deep triode region) and a large value (when the transistor is off). Moreover, the envelope time variation translates to a time-varying θ. In other words, addition of a constant inductance and a constant capacitance to the output nodes provides only a rough compensation.

The reader may wonder if it is possible to construct a three-port power network that provides isolation between two of the ports, thereby avoiding the above interaction. It can be shown that such a network inevitably suffers from loss.

In order to improve the compensation, the inductance and capacitance can track the envelope variation [24]. However, since it is difficult to vary the inductance, we must seek an arrangement that lends itself to only capacitance variation. To this end, let us implement Chireix’s cancellation technique as shown in Fig. 12.65(a). Interestingly, L_A and C_B shift the resonance frequencies of the two output tanks in opposite directions. We therefore surmise that if only unequal capacitors are tied to A and B and varied in opposite directions, then cancellation may still occur. As depicted in Fig. 12.65(b), we select C_A and C_B as [24]

(12.130)

(12.131)

seeking the proper value of ΔC. The admittances of the tanks are given by

(12.132)

(12.133)

where L₁ = L₂ = L₀. Noting that, for a narrowband signal, 1/(jL₀ω) and jC₀ω cancel, we use Eqs. (12.123) and (12.132) to write the total admittance at A:

(12.134)-(12.135)

Figure 12.65 (a) Outphasing PA using Chireix’s technique, (b) addition of variable capacitances, (c) circuit with discrete capacitor arrays.

The reactive parts cancel if

(12.136)

Similarly, for node B:

(12.137)-(12.138)

yielding the same ΔC as in (12.136), a fortunate coincidence.

The above development indicates that if ΔC varies in proportion to sin 2θ, then the cancellation is more accurate, leaving a real part in the overall impedance equal to

(12.139)

Unfortunately, this component also varies with the envelope.¹³ This issue can be alleviated by adjusting the strength of each PA so as to maintain a relatively constant output power [24]. Figure 12.65(c) shows the result [24], where both the capacitors and the transistors can be tuned in discrete steps. Utilizing bond wires for inductors and an off-chip balun, the PA delivers an output of 13 dBm in the WCDMA mode with a drain efficiency of 27% [24].

12.9 Doherty Power Amplifier

The amplifier stages studied thus far incorporate a single output transistor, inevitably approaching saturation as the transistor enters the triode region (saturation region for bipolar devices). We therefore postulate that if an auxiliary transistor is introduced that provides gain only when the main transistor begins to compress, then the overall gain can remain relatively constant for higher input and output levels. Figure 12.66(a) illustrates this principle: the main amplifier remains linear for input swings up to about V₁, and the auxiliary amplifier contributes to the output power as the input exceeds V₁. The former operates in class A and the latter in class C.

Figure 12.66 (a) Input/output characteristics of a Doherty PA, (b) hypothetical implementation.

While simple and elegant, the above principle is not straightforward to implement: How exactly should the auxiliary amplifier be tied to the main amplifier? Figure 12.66(b) shows an example where the currents produced by the two branches are simply summed at the output node. However, if the voltage swing at X is large enough to drive M₁ into the triode region, then it is likely to drive M₂ into the triode region, too.

Recognizing that amplitude-modulated signals reach their peak values only occasionally and hence cause a low average efficiency, Doherty has introduced the above two-path principle and developed the PA topology shown in Fig. 12.67(a) [25]. He has called the main and auxiliary stages the “carrier” and “peaking” amplifiers, respectively. The carrier PA is followed by a transmission line of length equal to λ/4, where λ denotes the carrier wavelength. To match the delay through this line, another λ/4 T-line is inserted in series with the input of the peaking amplifier.

Figure 12.67 (a) Conceptual realization of Doherty PA, (b) equivalent output network.

In order to understand the operation of the Doherty PA, we construct the equivalent circuit shown in Fig. 12.67(b), where I₁ and I₂ represent the RF currents produced by the carrier and peaking stages, respectively. Our first objective is to determine the impedance Z₁. The voltage and current waveforms at a point x along a lossless transmission line are respectively given by

(12.140)

(12.141)

where the first term in each expression represents a wave propagating in the positive x direction and the second, a wave propagating in the negative x direction, β = 2π/λ, and Z₀ is the line’s characteristic impedance. Since I₂ is delayed with respect to I₁ by λ/4 (= 90°), we write I₁ = I₀ cos ω₀t and I₂ = αI₀ cos(ω₀t − 90°) = αI₀ sin ω₀t, where α is a proportionality factor signifying the relative “strength” of the peaking stage. Equations (12.140) and (12.141) must now be satisfied at x = 0:

(12.142)

(12.143)

and at x = λ/4:

(12.144)

(12.145)

Writing a KCL at the output node, we have

(12.146)

and hence

(12.147)

It follows that

(12.148)

In the last step, we observe that Z₁ = −V₁/I₁, which from Eqs. (12.142) and (12.143) emerges as

(12.149)

Also, (12.143) yields V⁺ − V⁻ = −I₀Z₀ and hence Z₁ = −(V⁺ + V⁻)/I₀. Substituting these values in (12.148) gives

(12.150)

and

(12.151)

The key point here is that, as the peaking stage begins to amplify (α rises above zero), the load impedance seen by the main PA falls. This effect counteracts the increase of the main PA drain voltage swings that would be necessary for larger input levels, resulting in a relatively constant drain voltage swing beyond the transition point (Fig. 12.68). One can therefore choose V₁ such that the main PA operates in its linear region even for V_in > V₁.
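The load-modulation effect described above can be checked numerically with phasors. The following is a minimal sketch, assuming ideal current sources, a lossless λ/4 line, and a peaking current that lags the carrier current by 90° and is injected at the load node. The function name, the sign convention (carrier current flowing into the line), and the numeric values R_L = 25 Ω and Z₀ = 2R_L are illustrative assumptions, not details taken from [25].

```python
def main_pa_load(z0, r_load, alpha):
    """Impedance seen by the carrier (main) stage through a lossless quarter-wave line.

    Phasor convention (an assumption of this sketch):
      I1 = 1            carrier current driven into the line
      I2 = -1j * alpha  peaking current, lagging I1 by 90 degrees, injected at the load
    """
    i1 = 1.0 + 0.0j
    i2 = -1j * alpha * i1
    # For a quarter-wave line (beta * length = pi/2):
    #   V_in = j * Z0 * I_out   and   I_in = j * V_out / Z0
    v_out = i1 * z0 / 1j                # solve I_in = j * V_out / Z0 for V_out
    i_line_out = v_out / r_load - i2    # KCL at the load node
    v1 = 1j * z0 * i_line_out
    return v1 / i1


R_L = 25.0          # assumed load resistance, ohms
Z0 = 2.0 * R_L      # assumed line impedance, ohms
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    z1 = main_pa_load(Z0, R_L, alpha)
    print(f"alpha = {alpha:.2f}: Z1 = {z1.real:.1f} ohm")
```

Under these assumptions the result agrees with the closed form Z₀(Z₀/R_L − α): the impedance presented to the main stage falls linearly as the peaking stage contributes more current.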

Figure 12.68 Current and voltage variation in a Doherty PA.

Several properties of the Doherty PA can be derived [25]. We state the results here: (1) the technique extends the linear range by approximately 6 dB; (2) the efficiency reaches a theoretical maximum of 79% at full output power; (3) this efficiency is obtained if Z₀ in Fig. 12.67(a) is chosen equal to 2R_L.

The Doherty PA presents its own challenges with respect to IC design. The two transmission lines, especially that at the output, introduce considerable loss, degrading the efficiency. Also, for large swings, the transistor in the peaking stage turns on and off, producing discontinuities in the derivatives of the output current and possibly yielding a high adjacent channel power. In other words, the circuit may prove useful if signal compression must be avoided but not if ACPR must remain small.

12.10 Design Examples

Most power amplifiers employ two (or sometimes three) stages, with matching networks placed at the input, between the stages, and at the output (Fig. 12.69). The “driver” can be viewed as a buffer between the upconverter and the output stage, providing gain and driving the low input impedance of the latter. For example, if a PA must deliver +30 dBm, the two stages in Fig. 12.69 may have a gain of 25 to 30 dB, allowing the upconverter output to be in the range of 0 to +5 dBm. Depending on the carrier frequency and the power levels, the first matching network, N₁, may be omitted, i.e., the driver simply senses the upconverter output voltage.

Figure 12.69 Typical two-stage PA.

The input and output matching networks in Fig. 12.69 serve different purposes: N₁ may provide a 50-Ω input impedance, whereas N₃ amplifies the voltage swings produced by the output stage (or, equivalently, transforms R_L to a lower value). The 50-Ω input impedance is necessary if the PA is designed as a stand-alone circuit that interfaces with the preceding circuit by means of external components. In an integrated TX, on the other hand, the upconverter/PA interface impedance can be chosen much higher.

The matching network, N₂, in Fig. 12.69 is incorporated for practical reasons. Since the design may begin with load-pull measurements on the output transistor, the source impedance that this device must see for maximum efficiency is known and fixed once the design of the output stage is completed. Thus, the driver must drive such an input impedance, often requiring a matching network. In other words, the use of N₂ affords a modular design: first the output stage, next the driver, and last the interstage matching, with some iteration at the end. Without N₂, the driver and the output stage must be treated as a single circuit and co-designed for optimum performance. While possibly more complex, such a procedure may offer a somewhat higher efficiency because it avoids the loss of N₂.

In this section, we study a number of PA designs reported in the literature. As we will see, the efficiency and linearity vary substantially from one design to another. The reader is therefore cautioned that the comparison of the performance of different PAs is not straightforward. In particular, one must ask the following questions:

• What carrier frequency and maximum output power are targeted? The higher these are, the tighter the efficiency-linearity trade-off is.

• How much gain does the PA provide? Designs with lower gains tend to be more linear.

• Does the PA employ off-chip components? Most output matching networks are realized externally to avoid the loss of on-chip devices. For example, some designs incorporate bond wires as part of this network—even though such PAs may be called “fully integrated.”

• Does the IC technology provide thick metallization? For frequencies up to tens of gigahertz, a thick metal lowers the loss of on-chip inductors and transmission lines. (At higher frequencies, skin effect becomes dominant and the benefits of thick metallization diminish.)

• Does the design stress the transistor(s)? Many reported PAs employ a supply voltage equal to the maximum tolerable device voltage, V_max, but allow above-supply swings, possibly stressing the transistor(s).

• In what type of package is the PA tested? The package parasitics play a critical role in the performance of the PAs.

• Are the efficiency and ACPR measured at the same output power level? Some designs may quote the efficiency at the maximum power but the ACPR at a lower average output.

12.10.1 Cascode PA Examples

Nonlinear PAs can utilize cascode devices to reduce the stress on transistors. Figure 12.70 shows a class E example for the 900-MHz band [26]. Here, M₃ and M₄ turn on for part of the input swing. The use of a cascode device affords nearly twice the drain voltage swing (compared to a simple common-source stage), allowing the load resistance at the drain to be quadrupled. Consequently, the matching network need only transform 50 Ω to about 4.4 Ω for an output power of 1 W, exhibiting smaller losses. For these power levels, the on-resistance of the M₁–M₂ branch is chosen to be about 1.2 Ω, smaller than other equivalent resistances in the matching network, but requiring a W/L of 15 mm/0.25 μm for each! The large drain capacitance of M₂ is absorbed in C₁, and the gate capacitance of M₁ is tuned by a 2-nH bond wire and an external variable capacitance. Inductors L₂ and L₃ are also realized by bond wires.

Figure 12.70 Class E PA example.

The input stage consisting of M₃ and M₄ in Fig. 12.70 operates as a class C amplifier because the transistors have a negligible bias current until the swing raises V_B above V_TH₃ or drops V_A below V_DD − |V_TH₄|. The PA achieves a power-added efficiency of 41% while delivering 0.9 W with V_DD₁ = 2.5 V and V_DD₂ = 1.8 V. The actual design employs two copies of the circuit in quasi-differential form and combines the outputs by means of an off-chip balun [26].

Figure 12.71(a) shows another example of cascode PAs [27]. In order to allow even larger swings at the drain of M₂, this topology bootstraps the gate of the cascode device to the output through R₁. In other words, since V_P and hence V_Q rise with V_out, M₂ now experiences less stress than if V_P were constant. Of course, if V_P tracks V_out with unity gain, then M₂ operates as a diode-connected device, limiting the minimum value of V_out.¹⁴ For this reason, capacitor C₁ is added, creating a fraction of the output swing at V_P. Figure 12.71(b) plots the circuit’s waveforms, revealing that the maximum drain-source voltages experienced by M₁ and M₂ can be made approximately equal [27], leading to a large tolerable output swing.

Figure 12.71 (a) Cascode PA with bootstrapping, (b) circuit’s waveforms, (c) addition of diode-connected device.

In the ideal case, what output voltage swing does the topology of Fig. 12.71(a) provide?

Solution:

In the ideal case, V_DD can be chosen equal to the maximum allowable drain-source voltage, V_max, so that V_out can swing from nearly zero to about 2V_DD = 2V_max. This is possible if at V_out = 2V_max, the gate voltage of M₂ is raised enough to yield V_DS₂ = V_DS₁ = V_max.

The topology of Fig. 12.71(a) can be further improved by making the bootstrap path somewhat unilateral so that the positive swings are larger than the negative swings. Depicted in Fig. 12.71(c), the modified circuit includes an additional series branch consisting of R₂ and a diode-connected device, M₃. As V_out rises, M₃ turns on, allowing the gate voltage of M₂ to follow. On the other hand, as V_out falls, M₃ turns off, and only R₁ can pull the gate down.

Explain what happens to the output duty cycle in the presence of asymmetric positive and negative swings.

Solution:

Since the swing above V_DD is larger than that below, the duty cycle must be less than 50% to yield an average voltage still equal to V_DD. The average output power nonetheless increases. This can be seen from the nearly ideal waveforms shown in Fig. 12.72, where we have

(12.152)

to ensure the average voltage is equal to V_DD. The average power is given by

(12.153)

which, from Eq. (12.152), reduces to

(12.154)

Figure 12.72 Bootstrapped cascode waveforms in the presence of asymmetric swings.

Thus, as V₁ increases and hence T₁ decreases, P_avg rises because V₂ ≈ V_DD.
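The effect of asymmetric swings on the duty cycle and average power can be illustrated with a short numerical sketch. It assumes an idealized two-level waveform that sits at V_DD + V₁ for T₁ and at V_DD − V₂ for T₂, with the ac component absorbed by a resistance R. The function names and the normalized values (V_DD = 1 V, R = 1 Ω, unit period) are hypothetical and chosen only for illustration.

```python
def high_time_for_average_vdd(v1, v2, period):
    """Return T1 such that V1*T1 == V2*(period - T1), i.e. the average stays at V_DD."""
    return period * v2 / (v1 + v2)


def average_power(v1, v2, t1, t2, r):
    """Average power of the ac component of the two-level waveform in resistance r."""
    return (v1**2 * t1 + v2**2 * t2) / ((t1 + t2) * r)


V_DD = 1.0     # assumed supply voltage, volts; V2 is taken equal to V_DD
R = 1.0        # assumed resistance, ohms
PERIOD = 1.0   # normalized period

for v1 in (1.0, 1.5, 2.0):
    t1 = high_time_for_average_vdd(v1, V_DD, PERIOD)
    p_avg = average_power(v1, V_DD, t1, PERIOD - t1, R)
    print(f"V1 = {v1:.1f} V: duty cycle = {t1 / PERIOD:.2f}, P_avg = {p_avg:.2f} W")
```

With the average constrained to V_DD, the computed power equals V₁V₂/R, so a larger upward swing shortens the high interval yet raises the average power.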

Figure 12.73 shows the overall bootstrapped cascode PA design for the 2.4-GHz band [27]. The dashed box encloses the on-chip circuitry, L₁–L₃ denote bond wires, and T₁–T₇ are transmission lines implemented as traces on the printed-circuit board. The output stage utilizes device widths of W₃ = 2 mm and W₄ = 1.5 mm (with L = 0.18 μm), presenting an input capacitance of roughly 4 pF. In the driver stage, W₁ = 600 μm and W₂ = 300 μm.

Figure 12.73 Implementation of bootstrapped PA.

The circuit employs three matching networks: (1) T₁, C₁, and T₂ match the input to 50 Ω; (2) T₃, L₂, and C₂ provide interstage matching; and (3) L₃, T₄–T₆, C₃, and C₄ transform the 50-Ω load to a lower resistance. Transmission line T₇ acts as an open circuit at 2.4 GHz.

If the drain voltage of M₄ in Fig. 12.73 swings from 0.1 V to 4 V and the PA delivers +24 dBm, by what factor must the output matching network transform the load resistance?

Solution:

For a peak-to-peak swing of V_pp = 3.9 V, the power reaches +24 dBm (=250 mW) if

$$\frac{V_{pp}^2}{8R_{in}} = 250\ \text{mW}, \tag{12.155}$$

where R_in is the resistance seen at the drain of M₄. It follows that

$$R_{in} \approx 7.6\ \Omega. \tag{12.156}$$

The output matching network must therefore transform the load by a factor of 6.6.
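The arithmetic of this example can be reproduced with a few lines of code. This is a minimal sketch assuming a sinusoidal drain voltage, so that the delivered power is V_pp²/(8R_in); the helper names are illustrative.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def required_resistance(v_pp, p_out_w):
    """Resistance that absorbs p_out_w from a sinusoid with peak-to-peak swing v_pp."""
    return v_pp**2 / (8 * p_out_w)


v_pp = 4.0 - 0.1                   # drain swing from 0.1 V to 4 V
p_out = dbm_to_watts(24.0)         # about 250 mW
r_in = required_resistance(v_pp, p_out)
ratio = 50.0 / r_in                # transformation from the 50-ohm load

print(f"R_in = {r_in:.2f} ohm, transformation ratio = {ratio:.2f}")
```

Because the required resistance scales with the square of the swing, even a modest reduction in the available drain swing noticeably raises the transformation ratio.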

Operating with a supply of 2.4 V, the PA of Fig. 12.73 delivers a maximum (saturated) output of 24.5 dBm with a gain of 31 dB and a PAE of 49%. The output 1-dB compression is around 21 dBm.

Another example of cascode PA design is conceptually illustrated in Fig. 12.74(a) [28]. Here, a class B stage is added in parallel with a class A amplifier, contributing gain as the latter begins to compress. The operation is similar to that shown in Fig. 12.66(a) for the Doherty PA. The summation of the two outputs faces the same issue illustrated in Fig. 12.66(b), but if the two stages experience compression at the input, then their outputs can be simply summed in the current domain [28]. From this assumption emerges the PA circuit shown in Fig. 12.74(b), where M₁−M₄ form the main class A stage and M₅−M₆ the class B path. In this design, (W/L)_1,2 = 192/0.8, (W/L)_3,4 = 1200/0.34, and (W/L)_5,6 = 768/0.18 (all dimensions are in microns). Note that (W/L)_5,6 > (W/L)_1,2 because the class B devices take over at high output levels. The cascode transistors have a thicker oxide and longer channel so as to allow a higher voltage swing at the output.

Figure 12.74 (a) Parallel class A and B PAs to raise compression point, (b) realization of circuit.

The PA of Fig. 12.74(b) produces a maximum output of 22 dBm with a PAE of 44%. The small-signal gain is 12 dB and the output P₁_dB is 20.5 dBm.¹⁵

12.10.2 Positive-Feedback PAs

Our study of PAs in this chapter has revealed relatively large output transistors and the difficulty in driving them by the preceding stage. Now suppose, as conceptually illustrated in Fig. 12.75(a), the output transistor is decomposed into two, and one device, M₂, is driven by an inverted copy of V_out rather than by V_in. The input capacitance of the stage is therefore reduced proportionally. The implementation of the idea becomes straightforward in a differential design [Fig. 12.75(b)]. Since the input devices can now be substantially smaller, they are more easily switched, leading to a higher efficiency.

Figure 12.75 (a) Decomposition of an output device with one section driven by the output, (b) PA driving its own capacitance.

How should the drive capability be partitioned between M₁−M₃ and M₂−M₄ in Fig. 12.75(b)? We are tempted to allocate most of the required width to M₂−M₄ so as to minimize W₁ and W₃. However, as the design is skewed in this direction, two effects manifest themselves: (1) The capacitance at the output node becomes so large that it may dictate a small resonating inductance (L₁ and L₂) and hence a low output power. This issue is less problematic in class E stages where the output capacitance can be absorbed in the matching network. (2) As M₂ and M₄ become wider and carry a proportionally higher current, they form an oscillator with L₁ and L₂, which are loaded by the equivalent resistance, R_in.

Is it possible to employ an oscillatory PA stage? For a variable-envelope signal, such a circuit would create considerable distortion. However, for a constant-envelope waveform, an oscillatory stage may prove acceptable if its output phase can faithfully track the input phase. In other words, the cross-coupled oscillator must be injection-locked to the input with sufficient bandwidth so that the input phase excursions travel to the output unattenuated. If M₁ and M₃ in Fig. 12.75(b) are excessively small with respect to M₂ and M₄, then the input coupling factor may not guarantee locking. Of course, the lock range must be wide enough to cover the entire transmit band. In particular, the lock range can be expressed as

(12.157)

where Q ≈ L_1,2ω/(R_in/2). With a typical R_in of a few ohms, the lock range is usually quite wide.
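To get a feel for the numbers, the sketch below evaluates a commonly used one-sided lock-range expression for an injection-locked LC oscillator, ω_L = (ω₀/2Q)(I_inj/I_osc)/√(1 − I_inj²/I_osc²). This form is an assumption of the sketch and may differ in detail from Eq. (12.157). The inductance, R_in, and injection ratio are likewise illustrative values rather than data from a specific design.

```python
import math


def tank_q(inductance, omega, r_in):
    """Q estimate from the text: Q ~ L*omega / (R_in / 2)."""
    return inductance * omega / (r_in / 2)


def lock_range(omega0, q, inj_ratio):
    """One-sided lock range (rad/s) for an injection ratio I_inj/I_osc between 0 and 1."""
    if not 0 < inj_ratio < 1:
        raise ValueError("injection ratio must lie strictly between 0 and 1")
    return omega0 / (2 * q) * inj_ratio / math.sqrt(1 - inj_ratio**2)


f0 = 1.9e9                  # assumed carrier frequency, Hz
omega0 = 2 * math.pi * f0
L = 0.37e-9                 # assumed inductance, H
R_IN = 4.0                  # assumed equivalent resistance, ohms
INJ_RATIO = 0.3             # assumed I_inj / I_osc

q = tank_q(L, omega0, R_IN)
delta_f = lock_range(omega0, q, INJ_RATIO) / (2 * math.pi)
print(f"Q = {q:.2f}, one-sided lock range = {delta_f / 1e6:.0f} MHz")
```

The low Q that results from an R_in of a few ohms is what makes the lock range wide, consistent with the remark above.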

Figure 12.76 shows a 1.9-GHz class E PA based on injection locking [29]. Both stages incorporate positive feedback, and the inductors are realized by bond wires. In this design, all transistors have a channel length of 0.35 μm, W₅−W₈ = 980 μm, W₁ = W₃ = 3600 μm, and W₂ = W₄ = 4800 μm. Also, L₁–L₄ = 0.37 nH, L₅ = L₆ = 0.8 nH, and C_D = 5.1 pF. A microstrip balun on the PCB converts the differential output to single-ended form.

Figure 12.76 Injection-locked PA example.

Operating with a 2-V supply and producing a maximum drain voltage of 5 V, the circuit of Fig. 12.76 delivers 1 W of power with a PAE of 48%. It is suited to constant-envelope modulation schemes such as GMSK.

An interesting issue here relates to output power control. While in other topologies, reduction of the input level eventually produces an arbitrarily small output (even if the circuit is nonlinear), injection-locked PAs deliver a relatively large output even if the input amplitude falls to zero (if the circuit oscillates). Figure 12.77 depicts an example where M_p controls the bias current of the output stage. However, to ensure negligible efficiency degradation at the maximum output level, the on-resistance of this device with V_cont ≈ 0 must be very small, requiring a very wide transistor.

Figure 12.77 Injection-locked PA with output power control.

12.10.3 PAs with Power Combining

We have observed in this chapter that transistor stress issues limit the supply voltage and hence output swing of PAs, dictating a matching network with a large impedance transformation ratio. We may alternatively ask, is it possible to directly add the output voltages of several stages so as to generate a large output power?

Let us return to the notion of transformer-based matching [Fig. 12.78(a)]. The on-chip realization of 1-to-n transformers poses many difficulties, especially if the primary and/or secondary must carry large currents. For example, both the series resistance and the inductance of the primary must be kept very small if power levels of greater than hundreds of milliwatts are to be delivered. Also, as explained in Chapter 7, stacked transformers contain various parasitics, and multi-turn planar transformers can hardly achieve a turns ratio of greater than 2. In other words, it is desirable to employ only 1-to-1 transformers.

Figure 12.78 (a) Output stage model using a 1-to-n transformer, (b) circuit using two 1-to-1 transformers to combine the outputs, (c) simple 1-to-1 transformer.

With these issues in mind, we pursue transformer-based matching but using the approach shown in Fig. 12.78(b). Here, the primaries of two 1-to-1 transformers are placed in parallel while their secondaries are tied in series [30]. We expect that the circuit amplifies the voltage swing by a factor of 2 because V₁ = V₂ = V_in. As exemplified by Fig. 12.78(c), 1-to-1 transformers more easily lend themselves to integration.

Determine the equivalent resistance seen by V_in in Fig. 12.78(b) if the transformer loss is neglected.

Solution:

Since the power delivered to R_L is P_out = (2V_in)²/R_L, where V_in denotes the rms value of the input, we have

$$P_{in} = P_{out} = \frac{4V_{in}^2}{R_L}. \tag{12.158–12.159}$$

Also, P_in = V_in²/R_in, yielding

$$R_{in} = \frac{R_L}{4}, \tag{12.160}$$

which is identical to that of a 1-to-2 transformer driving a load resistance of R_L.

How is an actual output stage connected to the double-transformer topology of Fig. 12.78(b)? We can envision the simple arrangement depicted in Fig. 12.79(a), but the long, high-current-carrying interconnects between the amplifier and the two primaries introduce loss and additional inductance. Alternatively, we can “slice” the amplifier into two equal sections and place each in the close vicinity of its respective primary [Fig. 12.79(b)]. In this case, the amplifier input lines may be long, a less serious issue because they carry smaller currents.

Figure 12.79 (a) A single PA or (b) two PAs driving two transformers.

The concept illustrated in Fig. 12.79(b) can be extended to a multitude of 1-to-1 transformers so as to obtain a greater R_L/R_in ratio. Figure 12.80 shows a 2.4-GHz class E example employing four differential branches [30]. Each inductor is realized as an on-chip straight, wide metal line to handle large currents with a small resistance. For class E operation, a capacitor must be placed between the drains of each two input (differential) transistors, but the physical distance between N₁ and N₂, etc., inevitably adds inductance in series with the capacitor. Since the odd-numbered nodes in Fig. 12.80 have the same potential, and so do the even-numbered nodes, the capacitor is tied between, for example, N₂ and N₃ rather than between N₁ and N₂.

Figure 12.80 Power combining technique in [30].

Determine the differential resistance seen by each amplifier in Fig. 12.80 if the transformers are lossless.

Solution:

Returning to the simpler case illustrated in Fig. 12.79(b), we recognize that each of A₁ and A₂ sees twice the resistance seen by A₀, i.e., R_L/2. Thus, for the four-amplifier arrangement of Fig. 12.80, each differential pair sees a load resistance of R_L/4.
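The two results above generalize to n lossless 1-to-1 transformers with paralleled primaries and series secondaries. The sketch below assumes ideal transformers and identical amplifiers; the function names are illustrative. Since the output voltage is n times the input, power conservation gives a combined input resistance of R_L/n², and each of the n amplifiers, supplying 1/n of the power at the same voltage, sees n times that value.

```python
def combined_input_resistance(r_load, n):
    """Resistance seen by one source driving n paralleled primaries."""
    return r_load / n**2


def per_amplifier_resistance(r_load, n):
    """Resistance seen by each of n identical amplifiers, one per primary."""
    return n * combined_input_resistance(r_load, n)


R_L = 50.0  # assumed load resistance, ohms
for n in (1, 2, 4):
    print(
        f"n = {n}: combined R_in = {combined_input_resistance(R_L, n):.3f} ohm, "
        f"per amplifier = {per_amplifier_resistance(R_L, n):.3f} ohm"
    )
```

For n = 2 and n = 4 the per-amplifier values reduce to R_L/2 and R_L/4, matching the examples.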

Designed for a 2-W output level [30], the circuit of Fig. 12.80 incorporates wide input transistors. To create input matching, inductors are inserted between and of adjacent branches. The differential inputs are first routed to the center of the secondary and then distributed to all four amplifiers, thus minimizing phase and amplitude mismatches. One factor limiting the efficiency of transformer-based PAs is the primary/secondary coupling factor, typically no higher than 0.6 for planar structures [30].

The design in Fig. 12.80 is realized in 0.35-μm technology with a 3-μm thick top metal layer, producing an output of 1.9 W (32.8 dBm) with a PAE of 41%. The PA provides a small-signal gain of 16 dB and runs from a 2-V supply. The output P₁_dB is around 27 dBm.

The gain of the above PA falls to 8.7 dB at full output power [30]. Estimate the power consumed by a stage necessary to drive this PA.

Solution:

The driver must deliver 32.8 dBm − 8.7 dB = 24.1 dBm (= 257 mW). From previous examples, such a power can be obtained with an efficiency of about 40%, translating to a power consumption of about 640 mW. Since the above PA draws approximately 4 W from the supply,¹⁶ we note that the driver would require an additional 16% power consumption.
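The estimate can be reproduced as follows. This sketch simply restates the numbers used in the example (the 40% driver efficiency and the 4-W supply power are the same approximations made above); the variable names are illustrative.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


PA_OUT_DBM = 32.8        # PA output power
PA_GAIN_DB = 8.7         # PA gain at full output power
DRIVER_EFFICIENCY = 0.40 # assumed driver efficiency
PA_SUPPLY_MW = 4000.0    # approximate supply power drawn by the PA

driver_out_mw = dbm_to_mw(PA_OUT_DBM - PA_GAIN_DB)
driver_dc_mw = driver_out_mw / DRIVER_EFFICIENCY
overhead = driver_dc_mw / PA_SUPPLY_MW

print(f"driver output = {driver_out_mw:.0f} mW")
print(f"driver supply power = {driver_dc_mw:.0f} mW")
print(f"additional consumption = {100 * overhead:.0f}%")
```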

The multiple amplifiers driving the 1-to-1 transformers in the foregoing topologies can also be turned off individually, thus allowing output power control [31]. As illustrated in Fig. 12.81, if only M of the N amplifiers are on, then the output voltage swing drops by a factor of N/M. The notable benefit of this approach is that, as the output power is scaled down, it provides a higher efficiency than conventional PAs [31]. [The primary of the off stage(s) must be shorted by a switch.]

Figure 12.81 Power combining with switchable stages.

It is also possible to place the secondaries of the transformers in parallel so as to add their output currents [32].

12.10.4 Polar Modulation PAs

As explained in Section 12.7, a critical issue in polar modulation is the design of the supply modulation circuit for minimum degradation of efficiency and headroom. Figure 12.82 shows an example of an envelope path [33]. Here, a “delta modulator” (DM) generates a replica of V_env at the V_DD node of the PA output stage. The DM loop consists of a comparator, a buffer, and a low-pass filter.¹⁷ Owing to the high gain of the comparator, the loop ensures that the average output tracks the input even though the comparator produces only a binary waveform.

Figure 12.82 Polar modulation PA using a delta modulator for envelope path.

In the circuit of Fig. 12.82, the output stage’s average current flows through the LPF and the buffer. To minimize loss of efficiency and headroom, the LPF utilizes an (off-chip) inductor rather than a resistor, and the buffer must employ very wide transistors. Moreover, the DM loop bandwidth must accommodate the envelope signal spectrum and introduce a delay that can be matched by the phase path.

Figure 12.83 shows an example of a polar modulation transmitter [19]. In contrast to the topologies studied in Section 12.7, this architecture merges the envelope and phase loops: the highly-linear cascade of MX₁ and VGA₁ downconverts and reproduces both components at an IF, and the decomposition occurs at this IF. The output power is controlled by means of VGA₁ and VGA₂, e.g., as their gain increases, so does the output level such that the envelope at B remains equal to that at A. This also guarantees that the swing delivered to the feedback limiter is constant and it can be optimized for minimum AM/PM conversion. This transmitter consists of several modules realized in BiCMOS and GaAs technologies. The system delivers an output of +29 dBm in the EDGE mode at 900 MHz [19].

Figure 12.83 Polar modulation PA with envelope and phase feedback.

Depicted in Fig. 12.84(a) is another polar transmitter [18]. Here, the quadrature upconverter operates independently, generating an IF waveform having both envelope and phase components. The two signals are then extracted, with the former controlling the output stage and the latter driving an offset PLL.

Figure 12.84 (a) Polar modulation with envelope and phase signals separated at IF, (b) realization of output combining circuit.

Figure 12.84(b) shows the details of the TX front end. It consists of an envelope detector, a low-pass filter, and a double-balanced mixer driven by the VCO. Designed to deliver a power of +1 dBm, the mixer multiplies the envelope by the phase signal produced by the VCO, thus generating the composite waveform at the output [18]. As mentioned in Section 12.7, the dc offset in the envelope path leads to leakage of the phase component; this TX employs offset cancellation in the envelope path to suppress this effect.

The reader may wonder why the polar transmitters studied above do not employ a mixer of this type to combine the envelope and phase signals. Figure 12.84(b) suggests that the mixer requires a large voltage headroom, consuming substantial power. This technique is thus suited to low or moderate output levels.

12.10.5 Outphasing PA Example

Recall that outphasing transmitters incorporate two identical nonlinear PAs and sum their outputs to obtain the composite signal. Figure 12.85 shows the circuit realization of one PA for the 5.8-GHz band [34, 35]. An on-chip transformer serves as an input balun, applying differential phases to the driver stage. Inductors L₁ and L₂ and capacitors C₁ and C₂ provide interstage matching. The output stage operates in the class E mode, with L₃–L₅ and C₃ and C₄ shaping the nonoverlapping voltage and current waveforms. Note that the design assumes a load resistance of 12 Ω, a value provided by the power combiner described below.

Figure 12.85 PA used in an outphasing system.

If the above circuit operates with a 1.2-V supply and the minimum drain voltage is 0.15 V, estimate the peak drain voltage of M₃ and M₄.

Solution:

We note from Section 12.3.2 that the peak drain voltage is roughly equal to 3.56V_DD − 2.56V_DS. Thus, the drain voltage reaches 3.9 V. In the actual design, the peak drain voltage is 3.5 V [34, 35].

If the circuit of Fig. 12.85 delivers a power of 15.5 dBm to the 12-Ω load [34, 35], compare the drain voltage swing with that across R_L.

Solution:

Since 15.5 dBm corresponds to 35.5 mW, the peak-to-peak differential voltage swing across R_L is equal to √(8 × 35.5 mW × 12 Ω) ≈ 1.85 V. Thus, the class-E output network in fact reduces the voltage swing by a factor of 3.8 in this case.¹⁸ From a device stress point of view, this is undesirable.
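The two estimates in these examples can be checked numerically. The sketch below assumes a sinusoidal voltage across the 12-Ω load and, as a further assumption, takes the differential drain swing as roughly 2 × 3.5 V = 7 V, which is the value that reproduces the quoted reduction factor. All names are illustrative.

```python
import math


def class_e_peak_drain_voltage(v_dd, v_ds_min):
    """Approximate class E peak drain voltage: 3.56*V_DD - 2.56*V_DS."""
    return 3.56 * v_dd - 2.56 * v_ds_min


def load_swing_pp(p_out_w, r_load):
    """Peak-to-peak swing of a sinusoid delivering p_out_w to r_load."""
    return math.sqrt(8 * p_out_w * r_load)


v_peak_est = class_e_peak_drain_voltage(1.2, 0.15)

p_out_w = 10 ** (15.5 / 10) / 1000     # 15.5 dBm in watts
v_load_pp = load_swing_pp(p_out_w, 12.0)

V_DRAIN_PP_DIFF = 2 * 3.5              # assumed differential drain swing, volts
reduction = V_DRAIN_PP_DIFF / v_load_pp

print(f"estimated peak drain voltage = {v_peak_est:.2f} V")
print(f"load swing = {v_load_pp:.2f} Vpp, reduction factor = {reduction:.2f}")
```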

In order to sum the outputs of the PAs, the outphasing TX employs a “Wilkinson combiner” rather than a transformer. Recall from Section 12.3.2 that a transformer ideally exhibits no loss but it allows interaction between the two PAs. By contrast, a Wilkinson combiner ideally provides isolation between the two input ports but suffers from loss. Shown in Fig. 12.86(a), the combiner consists of two quarter-wavelength transmission lines and a resistor, R_T.

Figure 12.86 (a) Wilkinson power combiner, (b) equivalent circuit with differential inputs, (c) equivalent circuit with a common-mode input, (d) input CM impedance.

The Wilkinson divider is commonly analyzed in terms of “odd” (differential) and “even” (common-mode) inputs. For differential inputs in Fig. 12.86(a), the output summing junction and the midpoint of R_T are at ac ground [Fig. 12.86(b)]. The λ/4 lines transform the short circuit to an open circuit, yielding

(12.161)

That is, the differential component of V_in₁ and V_in₂ causes dissipation in R_T but not in R_L. For a common-mode input, all the points in the circuit rise and fall in unison [Fig. 12.86(c)]. Thus, R_L can be replaced with two parallel resistors of value 2R_L, and R_T with an open circuit [Fig. 12.86(d)]. In this case, the impedance seen by each voltage source is given by

(12.162)

We recognize that the common-mode component of V_in₁ and V_in₂ causes dissipation in R_L but not in R_T.

How does the Wilkinson combiner of Fig. 12.86(a) achieve isolation between the input ports?

Solution:

If the impedance seen by each input voltage source is constant and independent of differential or common-mode components, then V_in₁ does not “feel” the presence of V_in₂ and vice versa. This condition is satisfied if

(12.163)

(12.164)

Denoting all of these impedances by Z_in, we write

(12.165)

The result expressed by Eq. (12.162) reveals that the Wilkinson combiner can also transform the load impedance to a desired value if Z₀ is chosen properly. The outphasing system in [34, 35] transforms R_L = 50 Ω to Z_in = 12 Ω using Z₀ = 35 Ω. The combining of the two differential PA outputs requires four transmission lines, each having a length of 2.8 mm. The on-chip lines are wrapped around the PA circuitry and realized as shown in Fig. 12.87.
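The design relations implied by the odd/even-mode analysis can be sketched in code. The sketch assumes ideal, lossless λ/4 lines: for a common-mode input each line is terminated by 2R_L and presents Z₀²/(2R_L), while for a differential input each source sees R_T/2. Equating the two gives the isolation resistor. The function names are illustrative.

```python
import math


def common_mode_input_impedance(z0, r_load):
    """A quarter-wave line of impedance z0 terminated by 2*r_load."""
    return z0**2 / (2 * r_load)


def isolation_resistor(z0, r_load):
    """R_T for which the differential impedance R_T/2 equals the common-mode one."""
    return 2 * common_mode_input_impedance(z0, r_load)


def line_impedance_for(z_in, r_load):
    """Line impedance that transforms r_load to a desired z_in at each input."""
    return math.sqrt(2 * z_in * r_load)


z_in = common_mode_input_impedance(35.0, 50.0)
r_t = isolation_resistor(35.0, 50.0)
z0_for_12 = line_impedance_for(12.0, 50.0)

print(f"Z_in = {z_in:.2f} ohm, R_T = {r_t:.2f} ohm")
print(f"Z0 for Z_in = 12 ohm: {z0_for_12:.2f} ohm")
```

With Z₀ = 35 Ω and R_L = 50 Ω, the computed input impedance is close to the 12 Ω quoted for the design in [34, 35].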

Figure 12.87 On-chip Wilkinson combiner used at the output of outphasing system.

Designed in 0.18-μm technology, the outphasing PA of Fig. 12.85 incorporates thick-oxide transistors to sustain a peak drain voltage of 3.5 V. The overall circuit generates an output of 18.5 dBm with an efficiency of 47% while amplifying a 64-QAM OFDM signal.

References

[1] S. Cripps, RF Power Amplifiers for Wireless Communications, Norwood, MA: Artech House, 1999.

[2] A. Grebennikov, RF and Microwave Power Amplifier Design, Boston: McGraw-Hill, 2005.

[3] E. O. Johnson, “Physical Limitations on Frequency and Power Parameters of Transistors,” RCA Review, vol. 26, pp. 163–177, 1965.

[4] A. A. Saleh, “Frequency-Independent and Frequency-Dependent Nonlinear Models of TWT Amplifiers,” IEEE Trans. Comm., vol. COM-29, pp. 1715–1720, Nov. 1981.

[5] C. Rapp, “Effects of HPA-Nonlinearity on a 4-DPSK/OFDM-Signal for a Digital Sound Broadband System,” Rec. Conf. ECSC, pp. 179–184, Oct. 1991.

[6] J. C. Pedro and S. A. Maas, “A Comparative Overview of Microwave and Wireless Power-Amplifier Behavioral Modeling Approaches,” IEEE Trans. MTT, vol. 53, pp. 1150–1163, April 2005.

[7] H. L. Kraus, C. W. Bostian, and F. H. Raab, Solid State Radio Engineering, New York: Wiley, 1980.

[8] S. C. Cripps, “High-Efficiency Power Amplifier Design,” presented in Short Course: RF ICs for Wireless Communication, Portland, June 1996.

[9] J. Staudinger, “Multiharmonic Load Termination Effects on GaAs MESFET Power Amplifiers,” Microwave J., pp. 60–77, April 1996.

[10] N. O. Sokal and A. D. Sokal, “Class E - A New Class of High-Efficiency Tuned Single-Ended Switching Power Amplifiers,” IEEE J. of Solid-State Circuits, vol. 10, pp. 168–176, June 1975.

[11] F. H. Raab, “An Introduction to Class F Power Amplifiers,” RF Design, pp. 79–84, May 1996.

[12] H. Seidel, “A Microwave Feedforward Experiment,” Bell System Technical J., vol. 50, pp. 2879–2916, Nov. 1971.

[13] E. E. Eid, F. M. Ghannouchi, and F. Beauregard, “Optimal Feedforward Linearization System Design,” Microwave J., pp. 78–86, Nov. 1995.

[14] D. P. Myer, “A Multicarrier Feedforward Amplifier Design,” Microwave J., pp. 78–88, Oct. 1994.

[15] R. E. Myer, “Nested Feedforward Distortion Reduction System,” US Patent 6127889, Oct. 2000.

[16] L. R. Kahn, “Single-Sideband Transmission by Envelope Elimination and Restoration,” Proc. IRE, vol. 40, pp. 803–806, July 1952.

[17] W. B. Sander, S. V. Schell, and B. L. Sander, “Polar Modulator for Multi-Mode Cell Phones,” Proc. CICC, pp. 439–445, Sept. 2003.

[18] M. R. Elliott et al., “A polar modulator transmitter for GSM/EDGE,” IEEE J. of Solid-State Circuits, vol. 39, pp. 2190–2199, Dec. 2004.

[19] T. Sowlati et al., “Quad-band GSM/GPRS/EDGE Polar Loop Transmitter,” IEEE J. of Solid-State Circuits, vol. 39, pp. 2179–2189, Dec. 2004.

[20] H. Chireix, “High-Power Outphasing Modulation,” Proc. IRE, pp. 1370–1392, Nov. 1935.

[21] D. C. Cox, “Linear Amplification with Nonlinear Components,” IEEE Trans. Comm., vol. 22, pp. 1942–1945, Dec. 1974.

[22] D. C. Cox and R. P. Leck, “Component Signal Separation and Recombination for Linear Amplification with Nonlinear Components,” IEEE Trans. Comm., vol. 23, pp. 1281–1287, Nov. 1975.

[23] F. J. Casadevall, “The LINC Transmitter,” RF Design, pp. 41–48, Feb. 1990.

[24] S. Moloudi et al., “An Outphasing Power Amplifier for a Software-Defined Radio Transmitter,” ISSCC Dig. Tech. Papers, pp. 568–569, Feb. 2008.

[25] W. H. Doherty, “A New High Efficiency Power Amplifier for Modulated Waves,” Proc. IRE, vol. 24, pp. 1163–1182, Sept. 1936.

[26] C. Yoo and Q. Huang, “A Common-Gate Switched, 0.9 W Class-E Power Amplifier with 41% PAE in 0.25-μm CMOS,” VLSI Circuits Symp. Dig. Tech. Papers, pp. 56–57, June 2000.

[27] T. Sowlati and D. Leenaerts, “2.4 GHz 0.18-μm CMOS Self-Biased Cascode Power Amplifier with 23-dBm Output Power,” IEEE J. of Solid-State Circuits, vol. 38, pp. 1318–1324, Aug. 2003.

[28] Y. Ding and R. Harjani, “A CMOS High-Efficiency +22-dBm Linear Power Amplifier,” Proc. CICC, pp. 557–560, Sept. 2004.

[29] K. Tsai and P. R. Gray, “A 1.9-GHz 1-W CMOS Class E Power Amplifier for Wireless Communications,” IEEE J. Solid-State Circuits, vol. 34, pp. 962–970, 1999.

[30] I. Aoki et al., “Fully-Integrated CMOS Power Amplifier Design Using the Distributed Active Transformer Architecture,” IEEE J. Solid-State Circuits, vol. 37, pp. 371–383, March 2002.

[31] G. Liu et al., “Fully Integrated CMOS Power Amplifier with Efficiency Enhancement at Power Back-Off,” IEEE J. Solid-State Circuits, vol. 43, pp. 600–610, March 2008.

[32] A. Afsahi and L. E. Larson, “An Integrated 33.5 dBm Linear 2.4 GHz Power Amplifier in 65 nm CMOS for WLAN Applications,” Proc. CICC, pp. 611–614, Sept. 2010.

[33] D. K. Su and W. J. McFarland, “An IC for Linearizing RF Power Amplifiers Using Envelope Elimination and Restoration,” IEEE J. Solid-State Circuits, vol. 33, pp. 2252–2259, Dec. 1998.

[34] A. Pham and C. G. Sodini, “A 5.8-GHz 47% Efficiency Linear Outphase Power Amplifier with Fully Integrated Power Combiner,” IEEE RFIC Symp. Dig. Tech. Papers, pp. 160–163, June 2006.

[35] A. Pham, Outphasing Power Amplifiers in OFDM Systems, PhD Dissertation, MIT, Cambridge, MA, 2005.

Problems

12.1. Following the derivations leading to Eq. (12.16), prove that the other 50% of the supply power is dissipated by the transistor itself.

12.2. In Fig. 12.16, plot the current from V_DD as a function of time. Does this circuit provide the benefits of differential operation? For example, is the bond wire inductance in series with V_DD critical?

12.3. Prove that in Fig. 12.17, the voltage swings above and below V_DD are respectively equal to and , where I_p denotes the peak drain current. (Hint: the average value of V_X and V_Y must be equal to V_DD.)

12.4. From Example 12.11, sketch the scaling factor for the output transistor width as α varies from near zero to π/2.

12.5. Compute the maximum efficiency of the cascode PA shown in Fig. 12.31(a). Assume M₁ and M₂ nearly turn off but their drain currents can be approximated by sinusoids.

12.6. Assuming a third-order nonlinearity for the envelope detector in Fig. 12.46, prove that the output spectrum of the system exhibits growth in the adjacent channels.

12.7. Repeat the calculations leading to Eq. (12.77) but assuming that the phase signal experiences a delay mismatch of ΔT.

12.8. If transistor M₂ in Fig. 12.49(b) has an average current of I₀ and an average drain-source voltage of V₀, determine the efficiency of the stage. Neglect the on-resistance of M₁.

12.9. Derive Eq. (12.115) if θ(t) = sin⁻¹[V_env(t)/V₁].

12.10. Does the Doherty amplifier of Fig. 12.67(a) operate properly if the input is driven by an ideal voltage source? Explain your reasoning.

12.11. In the Doherty amplifier of Fig. 12.67(a), the value of α is chosen equal to 0.5. Plot the waveforms at x = 0 and x = λ/4, assuming Z₀ = R_L.
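
As a numerical sanity check for Problem 12.1, the sketch below averages the instantaneous powers of an idealized class A stage over one RF period. It is not the requested proof. It assumes that Eq. (12.16) refers to the 50% peak efficiency of the class A stage (as the problem wording implies), that the load inductor acts as an ideal RF choke carrying a constant bias current, and that the drain swings fully between 0 and 2V_DD. The function name and parameter values are hypothetical.

```python
import math

def class_a_power_budget(v_dd=1.0, r_load=50.0, n=10000):
    """Average supply, load, and transistor power over one RF period.

    Assumed idealization (not taken from the text): full drain swing,
    ideal RF choke, transistor current just reaching zero at its minimum.
    """
    i_bias = v_dd / r_load              # bias current for full swing
    p_supply = p_load = p_device = 0.0
    for k in range(n):
        s = math.sin(2 * math.pi * k / n)
        i_d = i_bias * (1 + s)          # drain current: bias plus sinusoid
        v_ds = v_dd * (1 - s)           # drain voltage falls as current rises
        i_load = i_bias * s             # ac current delivered to the load
        p_supply += v_dd * i_bias       # choke draws constant current from V_DD
        p_load += r_load * i_load ** 2  # instantaneous load power
        p_device += v_ds * i_d          # instantaneous transistor dissipation
    return p_supply / n, p_load / n, p_device / n

p_supply, p_load, p_device = class_a_power_budget()
print(p_load / p_supply, p_device / p_supply)
```

Under these assumptions both ratios come out to one half, so the load power and the transistor dissipation together account for all of the supply power.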
