A Low-power Impulse Radio
Ultra-wideband CMOS
Radio-frequency Transceiver

A dissertation submitted to
ETH ZURICH
for the degree of
Doctor of Sciences

presented by
DAVID BARRAS
Dipl. El.-Ing. EPFL
born 17 October 1972
citizen of Chermignon (VS), Switzerland

accepted on the recommendation of
Prof. Dr. Heinz Jäckel, examiner
Prof. Dr. Christian Enz, co-examiner
Dr. Walter Hirt, co-examiner

2010
What gets us into trouble is not what we don’t know, it’s what we know for sure that just ain’t so.

- Mark Twain

To Yoann
# Acknowledgments

There are many people whom I would like to thank for their invaluable support and contributions to this work. My warmest gratitude goes to my research advisor, Prof. Dr. Heinz Jäckel, who offered me the opportunity to undertake this work at the Laboratory of Electronics (IfE) of the Swiss Federal Institute of Technology (ETHZ), Zurich, Switzerland. I would especially like to thank him for his unfailing support and confidence, which allowed me to pursue this work under the best conditions.

I would also like to express my thanks to Dr. Frank Ellinger (now Professor at TU Dresden, Germany). His optimistic and dynamic personality and his constant encouragement were an essential help during the first half of this project. My deepest appreciation also goes to Dr. Christian Kromer for supervising the second half of this thesis. Without the help of these two people and their skills in CMOS analog circuits, this thesis would not exist in its present form.

I am deeply indebted to Dr. Walter Hirt of the IBM Research Laboratory, Rüschlikon, Switzerland, whose availability and commitment greatly helped me in the realization of this work. During these years, I profited from his vast knowledge and expertise in UWB. Our many fruitful discussions and his reviews of the manuscripts contributed greatly to this work. This thesis also benefited from the collaboration between ETHZ and the IBM Zurich Research Laboratory, Rüschlikon, Switzerland, within the joint Center for Advanced Silicon Electronics (CASE). In this context, I would like to acknowledge both Dr. Martin Schmatz and Dr. Basanth Jagannathan for their continuous support.

I would particularly like to express my gratitude to Prof. Dr. Christian Enz from EPFL, Lausanne, Switzerland, for agreeing to be co-examiner and for taking the time to evaluate this work.

In addition, I wish to thank all the members of the IfE, and in particular my officemates George von Bueren, Silvan Wehrli and Lucio Rodoni, for the numerous technical and non-technical discussions that made the days (and some nights...) of this PhD very pleasant. Many thanks also to Martin Lanz, Urs Egger, Thomas Kleier and Hans-Ruedi Benedikter for their assistance with PCB design, mounting chips on test substrates, and measurements. I am very obliged to the highly motivated students I had the opportunity to supervise and who were of great help during these years. My deepest gratitude goes to Robert Meier-Piening who was, through his Master’s thesis on the baseband ASIC, a real contributor to this work. The standalone transmit-receive link would not exist without him. This thesis also benefited from a collaboration with the Laboratory for Electromagnetic Fields and Microwave Electronics (IFH). In this context, I would like to thank Corrado Carta, Oliver Lauer, Juerg Froehlich, Marco Zahner and Michael Mueri.

Special thanks go to Ruth Zaehringer for always being there to help with any imaginable non-scientific needs a PhD might possibly entail. I am also much obliged to the system administrators, Beat “Zosh” Mueller, Fredy Mettler and Frank K. Gurkaynak.

This work was partly supported by the Swiss Innovation Promotion Agency (KTI/CTI) under Project Number KTS-6322.1.

Last but by no means least, I would like to express my heartfelt thanks to my parents, Marie-Paule and Gaston, and my brother Jean-Philippe, for their support during these years, and especially for the delicious “raclette-apéro” organized after the defense of my PhD. Finally, I express my deepest gratitude to my wife Camille for her love and patience.
# Table of Contents

**Acknowledgments** i

**Abstract** xviii

**Résumé** xx

1. **Introduction**
   1.1. Motivation and Background ........................................ 1
   1.2. Ultra-Wideband: One Definition? .................................. 2
   1.3. KTI Research Project ................................................. 3
   1.4. Content and Organization of the Thesis .......................... 4
   1.5. Definitions: Impulse, Pulse and Burst ............................ 5

2. **Ultra-Wideband Radio Technologies** ................................ 7
   2.1. UWB History ...................................................................... 7
   2.2. Regulations in the USA: FCC Amendment .......................... 8
   2.3. Activities and Regulation in Europe ................................. 8
       2.3.1. Overview of European Regulatory Bodies ..................... 9
       2.3.2. The First Mandate ............................................... 10
       2.3.3. The Extremely Conservative ECC Report 64 ................ 11
       2.3.4. The 2nd and 3rd Mandate: Mitigation Techniques ......... 11
       2.3.5. The European Decision: ECC/DEC/(06)04 .................. 12
       2.3.6. The Amendment to the Final Decision ....................... 13
       2.3.7. Some Considerations on Mitigation Techniques ............ 14
   2.4. ITU-R Task Group 1/8 .................................................. 16
   2.5. This Work vs. the Regulations ....................................... 17
   2.6. The Standardization Processes ....................................... 19
       2.6.1. The IEEE 802.15 Working Group .............................. 19
       2.6.2. High-speed UWB: the Failure of TG3a ....................... 19
       2.6.3. The Standard Ecma 368/369 .................................... 21
       2.6.4. Low Data Rate UWB: TG4a ................................... 21
   2.7. Conclusions ................................................................. 22
3. Transceiver Planning

3.1. Key Considerations and Main Objectives ............................................. 23
3.2. Radio Channel Basics ............................................................................ 24
  3.2.1. Introduction ......................................................................................... 24
  3.2.2. Wireless Channel Metrics ................................................................. 26
  3.2.3. Modeling Concepts ............................................................................ 28
3.3. The UWB Channel .................................................................................... 29
  3.3.1. Introduction ......................................................................................... 29
  3.3.2. Early UWB Multipath Models .............................................................. 30
  3.3.3. The IEEE 802.15.4a Model ................................................................. 31
  3.3.4. The Path Loss of the IEEE 802.15.4a Model ...................................... 33
  3.3.5. Channel Dynamics ............................................................................. 35
  3.3.6. Conclusions ......................................................................................... 35
3.4. IR-UWB Signals ....................................................................................... 36
  3.4.1. Gaussian IR-UWB Pulse Metrics ....................................................... 36
  3.4.2. Frequency Translated Gaussian Pulse ................................................. 38
  3.4.3. Pulse Bandwidth (Ideal Case) .............................................................. 39
  3.4.4. How Wide Should an IR-UWB Pulse Be? ........................................... 40
  3.4.5. Non-dithered Signal Spectrum ............................................................ 42
3.5. The Modulation Scheme .......................................................................... 44
  3.5.1. Modulation Schemes for IR-UWB ...................................................... 44
    OOK ............................................................................................................ 46
    Binary PPM (BPPM) .................................................................................. 47
    Noncoherent Binary FSK .......................................................................... 47
  3.5.2. The Need for Carrier-based IR-UWB ................................................. 49
  3.5.3. Implementation Issues for Down-conversion ..................................... 50
  3.5.4. Proposed Modulation Scheme ............................................................ 52
  3.5.5. IR-UWB BFSK Signalling Parameters ............................................. 53
3.6. Receiver Architecture ............................................................................. 54
3.7. Demodulation Principle ......................................................................... 57
  3.7.1. Introduction ......................................................................................... 57
  3.7.2. Outline and Methodology ................................................................. 58
  3.7.3. Transceiver Model ............................................................................. 59
  3.7.4. Envelope Detection ........................................................................... 61
    Principle ..................................................................................................... 61
    Phase Statistics .......................................................................................... 62
    Zero-Crossings Instants $t_k$ .................................................................... 65
    Remarks on I/Q Mismatch in Signal Demodulation ................................ 68
    Output for Gaussian Envelope Input Signals ........................................ 68
    Transformation of SNR $\tilde{\gamma}_e$ into $E_{tx}/N_0$ .................................. 69
  3.7.5. Noise PSD at the S&H Output .................................. 70
  3.7.6. Bit Error Rate in an AWGN Channel ......................... 73
3.8. Demodulation in UWB Channels ................................. 75
  3.8.1. Integrate-and-Dump Receive Filter .......................... 75
  3.8.2. Optimum Integration Time .................................. 76
  3.8.3. Other Multipath Effects ................................... 76
  3.8.4. Effect of Interpulse Interference (IPI) .................... 79
3.9. Link Budget ....................................................... 81
  3.9.1. Performances in Free-space with Optimal Filter ......... 82
  3.9.2. Performance with I&D Filter ................................. 82
  3.9.3. Summary ...................................................... 82
3.10. Implementation Losses .......................................... 83
  3.10.1. I/Q mismatch ............................................... 83
  3.10.2. BER Degradation due to Channel Filter .................. 84
  3.10.3. BER Degradation due to a Frequency Offset .............. 85
  3.10.4. BER Degradation due to LO Phase Noise ................. 86
    Phase Noise Effect ............................................... 86
    Phase Noise Modelling for High Level Simulation .......... 88
    Simulation Results ............................................... 89
3.11. Interferences and Linearity .................................... 91
  3.11.1. BER Degradation in the Presence of Interferers ........ 91
    CW Interfering Signal ......................................... 93
    IR-UWB Interfering Signal .................................... 93
  3.11.2. Potential Interfering Signals ................................ 94
  3.11.3. Front-End Linearity Requirement .......................... 94
  3.11.4. Second- and Third-Order IMP ............................ 96
    Out-of-band CW Interferences .................................. 96
3.12. Summary and Conclusions ...................................... 96

4. Fully-integrated CMOS IR-UWB Transmitter .......................... 101
  4.1. Issues in IR-UWB Signal Generation .......................... 101
  4.2. Transmitter (Tx) Architectures ............................... 102
    4.2.1. Up-conversion Quadrature Tx Architectures ............ 102
    4.2.2. Proposed Transmitter: Direct Modulation .............. 103
    4.2.3. Signalling Schemes ...................................... 105
  4.3. Analog Pulse Shaping for LRPM ................................ 106
    4.3.1. Introduction and Principle ................................ 106
    4.3.2. Tx Parameters under FCC and ECC Regulations .......... 108
  4.4. The Modulator and the Output Stage .......................... 109
    4.4.1. Architecture Overview and Specifications .............. 109
    4.4.2. The AM/PM Modulation Section
      Amplitude Modulation Circuit
      Polarity Modulation Circuit
    4.4.3. Power Amplifier Section
      Description
      Post-layout Simulations
  4.5. Pulse Shaping Filter
    Introduction
    Implementation
    Automatic Tuning Control
    Experimental Results
  4.6. Tx Measurements
    BFSK IR-UWB vs. FCC Regulation
    BFSK IR-UWB vs. ECC Regulation
    PMCW Measurements
    LO leakage
  4.7. Performance Comparison with Other Tx
    Oscillator-based
    Fully-Digital Solutions
  4.8. Summary and Conclusions

5. The PLL

5.1. Introduction
5.2. PLL Dimensioning
  PLL Architecture
    Open-loop Equations
    Closed-loop Equations
  Settling Time for Direct Modulation
  Loop delay
  Stability Criterion
5.3. The Wideband Oscillator
  Introduction
  Ring Oscillator
    Oscillation Frequency
    Voltage Swing $V_{pp}$
  Core Inverter Cell Optimization for Tuning Range
  Inverter Cell Equivalent Circuit
  Tuning Range
  How Many Oscillators are Needed?
  Phase Noise
  5.3.5. Simulation Results of the Oscillating Core . . . . 157
5.4. The Frequency Divider . . . . . . . . . . . . . . . . . . 158
  5.4.1. Proposed Architecture . . . . . . . . . . . . . . . . 158
  5.4.2. The Tri-Modulus Prescaler Principle . . . . . . . . 162
  5.4.3. High-Frequency Fixed-Modulus Prescaler . . . . . 164
    Wideband Sensitivity Issues . . . . . . . . . . . . . . . 164
  5.4.4. The Phase-Rotator . . . . . . . . . . . . . . . . 165
    Phase-Selection Circuit . . . . . . . . . . . . . . . . 166
    Divide-by-3/4 Medium Speed Prescaler . . . . . . . . . 167
    Phase-Selector Driver . . . . . . . . . . . . . . . . . 169
5.5. PLL Measurements . . . . . . . . . . . . . . . . . . . . 171
  5.5.1. PLL Dynamics for Tx: Frequency Modulation . . 172
  5.5.2. PLL for Rx LO Generation . . . . . . . . . . . . . . 172
    Phase Noise . . . . . . . . . . . . . . . . . . . . . . . 172
5.6. Summary and Conclusions . . . . . . . . . . . . . . . . . 173

6. Analog Front-End Receiver 177
  6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
  6.2. Low-Noise Amplifier (LNA) . . . . . . . . . . . . . . . . 179
    6.2.1. LNA Circuit Overview . . . . . . . . . . . . . . . . 179
    6.2.2. Wideband Design Techniques . . . . . . . . . . . . 181
      Gain Flatness Constraint on Q-factor . . . . . . . . . . 181
      Impedance Matching Constraint on Q-factor . . . . . . 181
      Influence of $C_{gd}$ on Z-Match and Q-factor . . . . . 185
      Degeneration Inductance with Miller Effect . . . . . . 187
    6.2.3. Cascode LNA Transconductance . . . . . . . . . . 187
      Input Stage Transconductance $G_m$ . . . . . . . . . . 188
      Cascode Current Gain . . . . . . . . . . . . . . . . . 189
      Integrated 2.45-GHz Notch Filter . . . . . . . . . . . 189
    6.2.4. Noise Analysis . . . . . . . . . . . . . . . . . . . . 190
      Noise Due to the Source . . . . . . . . . . . . . . . . 191
      Noise Due to the Common Source Stage . . . . . . . . . 192
      Channel Thermal Noise . . . . . . . . . . . . . . . . . 192
      Channel Thermal Noise Model . . . . . . . . . . . . . . 194
      Induced Gate Noise Model . . . . . . . . . . . . . . . . 194
      Induced Gate Noise of a Common-Source Stage . . . . . 195
      Cascode Noise . . . . . . . . . . . . . . . . . . . . . . . 199
      Noise Contribution of the Resistive Load . . . . . . . . 203
      LNA Noise Figure . . . . . . . . . . . . . . . . . . . . . 203
      Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.2.5. Layout Issues ........................................ 205
    Single-ended vs. Differential .......................... 205
6.2.6. Simulation and Measurements ....................... 205
    S-parameters ........................................... 205
    Noise Figure Measurements ............................. 207
    Linearity ............................................. 207
6.2.7. Summary and Conclusions ............................ 211
6.3. Down-Conversion Mixers ............................... 212
    6.3.1. Introduction .................................... 212
    6.3.2. Implemented Mixer ................................ 214
        Current Stealing .................................. 214
        Transconductance Section ......................... 215
        LO Switch Section .................................. 217
        Output Common Mode Regulation ................... 219
6.4. RF Front-End Characterization .......................... 220
    6.4.1. Gain, Input Matching and Noise Figure ......... 220
    6.4.2. Linearity ....................................... 221
        Intermodulation IIP2 and IIP3 ..................... 221
        Harmonic Distortions HD2 ........................... 222
    6.4.3. Test Chip ....................................... 223
6.5. The Variable Gain Amplifier (VGA) ..................... 224
    6.5.1. Specifications ................................... 224
    6.5.2. Required Gain-Bandwidth Product ............... 225
    6.5.3. Design Guideline ................................ 226
    6.5.4. Automatic Gain Control (AGC) Loop ............. 227
        Block Schematic .................................... 227
        AGC Loop Equation .................................. 228
        Detector and Loop Filter: Implementation ....... 230
        AGC Loop Behavior with Squarer Detector ....... 232
    6.5.5. Amplification Cell ................................ 234
    6.5.6. PMOS Active Load ................................ 236
    6.5.7. Measurements of the Implemented IC ............ 238
        Gain and Bandwidth ................................... 241
        Linearity ........................................... 241
        Noise .............................................. 244
        I/Q Imbalance ....................................... 245
        AGC Behavior ....................................... 246
    6.5.8. VGA Performance Summary ......................... 247
6.6. Receiver Summary ..................................... 248
7. Digital Back-End and Experimental Results 251
   7.1. The Digital Baseband ......................... 251
   7.2. The Analog Section: Demodulation & Detection .... 253
      7.2.1. Principles ................................ 253
      7.2.2. Quadrature Pulse Demodulation ............. 253
      7.2.3. Signal Detection .......................... 255
      7.2.4. Comparator Bank ........................... 257
   7.3. The Digital Section: Synchronization Algorithm .... 258
      7.3.1. Issues during Initial Signal Acquisition .... 259
      7.3.2. Proposed Solution .......................... 259
      7.3.3. Improvement for Reduced SNR ............... 261
      7.3.4. CORDIC Algorithm .......................... 262
      7.3.5. SNR Estimation ............................. 264
      7.3.6. Optimum Number of Pulses M during Cold Start 264
      7.3.7. The Tracking Algorithm, “Early/Late” Revisited 268
      7.3.8. Timing ..................................... 268
   7.4. Baseband ASIC Power Consumption .................. 270
   7.5. BER Measurements ............................... 273
   7.6. SNR Estimate Through CORDIC Vector Length ....... 274
      7.6.1. Measurements ............................... 274
   7.7. BER Measurements with Free-running VCO .......... 277
   7.8. Clock Offset Sensitivity ........................ 278
   7.9. Comparison with Other Systems .................... 279
   7.10. Summary and Conclusions ....................... 280

8. Conclusions 283
   8.1. IR-UWB Transmitters ........................... 283
   8.2. IR-UWB Receiver ............................... 285
   8.3. Outlook ....................................... 286

A. Equations 289
   A.1. Influence of $C_{gd}$ on Input Q-factor of the LNA .... 289
   A.2. Common-source Transconductance $G_m$ ............ 290

B. Low-power UWB Wavelets Generator 293
   B.1. Introduction .................................. 293
   B.2. UWB Signal Generation Techniques ............... 295
   B.3. Design Constraints ............................. 296
      B.3.1. Oscillator Equivalent Circuit ............... 296
      B.3.2. Wavelet Generator’s Settling Time .......... 300
   B.4. Circuit Implementation ........................ 302
## List of Figures

<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1</td>
<td>UWB applications</td>
<td>2</td>
</tr>
<tr>
<td>2.1</td>
<td>European regulatory bodies</td>
<td>10</td>
</tr>
<tr>
<td>2.2</td>
<td>UWB spectrum masks</td>
<td>16</td>
</tr>
<tr>
<td>2.3</td>
<td>Working Groups 802.15</td>
<td>20</td>
</tr>
<tr>
<td>3.1</td>
<td>Number of IEEE publications on UWB since 1988</td>
<td>24</td>
</tr>
<tr>
<td>3.2</td>
<td>Illustration of the cluster phenomenon</td>
<td>30</td>
</tr>
<tr>
<td>3.3</td>
<td>Average power decay profiles of IEEE802.15.4 channels</td>
<td>32</td>
</tr>
<tr>
<td>3.4</td>
<td>Impulse response of channel model CM3</td>
<td>33</td>
</tr>
<tr>
<td>3.5</td>
<td>Normalized Gaussian pulse in time domain</td>
<td>37</td>
</tr>
<tr>
<td>3.6</td>
<td>Fourier transform of the channel’s impulse response</td>
<td>43</td>
</tr>
<tr>
<td>3.7</td>
<td>Unmodulated pulse amplitude vs. FCC mask</td>
<td>44</td>
</tr>
<tr>
<td>3.8</td>
<td>Block schematic of the generic noncoherent receiver</td>
<td>45</td>
</tr>
<tr>
<td>3.9</td>
<td>Typical modulation schemes and receivers for IR-UWB</td>
<td>48</td>
</tr>
<tr>
<td>3.10</td>
<td>Bit error rate performances of noncoherent receivers</td>
<td>49</td>
</tr>
<tr>
<td>3.11</td>
<td>Down-conversion principles</td>
<td>51</td>
</tr>
<tr>
<td>3.12</td>
<td>Principle of 2-FSK/AM-C modulation</td>
<td>53</td>
</tr>
<tr>
<td>3.13</td>
<td>Simulated and theoretical non-dithered PSD</td>
<td>55</td>
</tr>
<tr>
<td>3.14</td>
<td>Simulated and theoretical dithered PSD</td>
<td>55</td>
</tr>
<tr>
<td>3.15</td>
<td>Low-complexity UWB receiver architecture</td>
<td>56</td>
</tr>
<tr>
<td>3.16</td>
<td>Implementation of the BFSK demodulator</td>
<td>58</td>
</tr>
<tr>
<td>3.17</td>
<td>System model of the whole transceiver</td>
<td>60</td>
</tr>
<tr>
<td>3.18</td>
<td>Functional description of the BFSK demodulator</td>
<td>61</td>
</tr>
<tr>
<td>3.19</td>
<td>Comparison of the exact and approximated pdf’s</td>
<td>65</td>
</tr>
<tr>
<td>3.20</td>
<td>Model comparison at different channel filter bandwidths</td>
<td>67</td>
</tr>
<tr>
<td>3.21</td>
<td>Model of the sample-and-hold device</td>
<td>71</td>
</tr>
<tr>
<td>3.22</td>
<td>Approximation of the noise PSD at the receive filter input</td>
<td>72</td>
</tr>
<tr>
<td>3.23</td>
<td>Comparison of theoretical and simulated BER’s</td>
<td>74</td>
</tr>
<tr>
<td>3.24</td>
<td>Simulated eye diagram of demodulator output</td>
<td>76</td>
</tr>
<tr>
<td>3.25</td>
<td>Optimum integration time $T_I$ for different indoor channels</td>
<td>77</td>
</tr>
<tr>
<td>3.26</td>
<td>Simulated BER for channel model CM1</td>
<td>78</td>
</tr>
<tr>
<td>3.27</td>
<td>Simulated BER for channel model CM7</td>
<td>78</td>
</tr>
<tr>
<td>3.28</td>
<td>Influence of ISI for IEEE channel CM3</td>
<td>80</td>
</tr>
<tr>
<td>3.29</td>
<td>Sensitivity degradation against mismatch</td>
<td>84</td>
</tr>
<tr>
<td>3.30</td>
<td>Sensitivity degradation against corner frequency</td>
<td>85</td>
</tr>
<tr>
<td>5.10</td>
<td>VCO frequency and gain vs. control voltage and $T^\circ$</td>
<td>159</td>
</tr>
<tr>
<td>5.11</td>
<td>Transient behavior of the VCO</td>
<td>160</td>
</tr>
<tr>
<td>5.12</td>
<td>Top-level block schematic of the frequency divider</td>
<td>161</td>
</tr>
<tr>
<td>5.13</td>
<td>Timing diagram of the division principle</td>
<td>161</td>
</tr>
<tr>
<td>5.14</td>
<td>Timing diagram of the tri-modulus prescaler</td>
<td>163</td>
</tr>
<tr>
<td>5.15</td>
<td>Simplified schematic of the RF fixed-modulus prescaler</td>
<td>165</td>
</tr>
<tr>
<td>5.16</td>
<td>HF prescaler sensitivity</td>
<td>166</td>
</tr>
<tr>
<td>5.17</td>
<td>Simplified schematic of the phase-selector</td>
<td>167</td>
</tr>
<tr>
<td>5.18</td>
<td>Phase-rotator driver</td>
<td>168</td>
</tr>
<tr>
<td>5.19</td>
<td>Timing diagram of the frequency divider</td>
<td>170</td>
</tr>
<tr>
<td>5.20</td>
<td>PLL chip micrograph</td>
<td>171</td>
</tr>
<tr>
<td>5.21</td>
<td>VCO control voltage response to a step frequency</td>
<td>174</td>
</tr>
<tr>
<td>5.22</td>
<td>PLL phase noise</td>
<td>175</td>
</tr>
<tr>
<td>6.1</td>
<td>RF front-end block schematic</td>
<td>178</td>
</tr>
<tr>
<td>6.2</td>
<td>LNA simplified schematic</td>
<td>180</td>
</tr>
<tr>
<td>6.3</td>
<td>Equivalent half-circuit schematic of the LNA input</td>
<td>182</td>
</tr>
<tr>
<td>6.4</td>
<td>Wideband matching principle</td>
<td>185</td>
</tr>
<tr>
<td>6.5</td>
<td>Miller effect on the input quality factor</td>
<td>186</td>
</tr>
<tr>
<td>6.6</td>
<td>Small signal equivalent schematic of the cascode LNA</td>
<td>188</td>
</tr>
<tr>
<td>6.7</td>
<td>Cascode LNA noise calculation overview</td>
<td>191</td>
</tr>
<tr>
<td>6.8</td>
<td>Channel thermal noise contribution of the CS stage</td>
<td>193</td>
</tr>
<tr>
<td>6.9</td>
<td>MOSFET channel thermal noise model</td>
<td>195</td>
</tr>
<tr>
<td>6.10</td>
<td>Equivalent schematic for the induced gate noise</td>
<td>196</td>
</tr>
<tr>
<td>6.11</td>
<td>Short-channel $g_m$ and $g_{do}$</td>
<td>199</td>
</tr>
<tr>
<td>6.12</td>
<td>Channel thermal noise of the common gate transistor</td>
<td>199</td>
</tr>
<tr>
<td>6.13</td>
<td>Extracted ratio $K_g$</td>
<td>201</td>
</tr>
<tr>
<td>6.14</td>
<td>LNA’s differential transfer function</td>
<td>206</td>
</tr>
<tr>
<td>6.15</td>
<td>LNA’s differential input matching</td>
<td>208</td>
</tr>
<tr>
<td>6.16</td>
<td>Measured noise figure of the UWB LNA</td>
<td>209</td>
</tr>
<tr>
<td>6.17</td>
<td>Measured third-order nonlinearity</td>
<td>210</td>
</tr>
<tr>
<td>6.18</td>
<td>Measured second-order nonlinearity</td>
<td>211</td>
</tr>
<tr>
<td>6.19</td>
<td>Simplified schematic of the mixer</td>
<td>215</td>
</tr>
<tr>
<td>6.20</td>
<td>Optimization of the LO switch transistors</td>
<td>218</td>
</tr>
<tr>
<td>6.21</td>
<td>RF front-end measurements</td>
<td>220</td>
</tr>
<tr>
<td>6.22</td>
<td>RF front-end linearity measurements</td>
<td>222</td>
</tr>
<tr>
<td>6.23</td>
<td>RF front-end chip micrograph</td>
<td>224</td>
</tr>
<tr>
<td>6.24</td>
<td>Estimation of the GBW of a single cell</td>
<td>225</td>
</tr>
<tr>
<td>6.25</td>
<td>Typical VGA transfer function with AGC</td>
<td>227</td>
</tr>
<tr>
<td>6.26</td>
<td>AGC loop block diagram</td>
<td>228</td>
</tr>
</tbody>
</table>
6.27 Detection principle used in the AGC loop
6.28 Simulated square law detector transfer function
6.29 Step response of an AGC loop with square-law detection
6.30 Exponential gain law principle
6.31 Quasi-exponential transconductance law
6.32 Active load principle
6.33 VGA chip photograph
6.34 VGA simplified schematic
6.35 Measured VGA gain transfer functions
6.36 Measured and simulated VGA gains at 150 MHz
6.37 Measured VGA cut-off frequencies vs. gain
6.38 SNR degradation due to a strong nonlinearity
6.39 Measured VGA linearity
6.40 Measured VGA noise figures
6.41 Measured VGA amplitude and phase I/Q imbalance
6.42 AGC static and dynamic behavior

7.1 Simplified block schematic of the baseband ASIC
7.2 Block schematic of the baseband ASIC’s analog section
7.3 Block schematic of the I&D function
7.4 Simplified schematic of the track-and-latch comparator
7.5 Digital synchronization section
7.6 Erroneous synchronization situations
7.7 Detection by vector accumulation
7.8 Determination of M in AWGN only
7.9 Determination of M at the receiver’s sensitivity level
7.10 Optimum choice of M with respect to SNR
7.11 Timing diagram of the synchronization
7.12 Baseband ASIC chip photograph
7.13 BER test setup
7.14 Measured RF IR-UWB transmitted signals
7.15 Measured BER performance and prediction
7.16 ASIC’s behavior after initial acquisition
7.17 Sensitivity degradation due to free-running VCO

B.1 Typical architecture of a non-coherent UWB transceiver
B.2 Equivalent schematic of the cross-coupled oscillator
B.3 Transfer function of differential pair
B.4 Envelope function A(t) for different initial conditions
B.5 Simplified circuit schematic of the wavelet generator
B.6 Simulated and measured differential outputs
<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>B.7</td>
<td>Calculated PSD of the measured and simulated signal</td>
<td>308</td>
</tr>
<tr>
<td>B.8</td>
<td>UWB wavelet generator micrograph</td>
<td>310</td>
</tr>
<tr>
<td>C.1</td>
<td>Type of IR-UWB transmitter considered</td>
<td>315</td>
</tr>
<tr>
<td>C.2</td>
<td>Ideal current inverter formed by a high-pass LC-network</td>
<td>316</td>
</tr>
<tr>
<td>C.3</td>
<td>Balun principle</td>
<td>317</td>
</tr>
<tr>
<td>C.4</td>
<td>Simplified schematic of the output stage</td>
<td>318</td>
</tr>
<tr>
<td>C.5</td>
<td>Approximation of $i_{neg}(s)$</td>
<td>319</td>
</tr>
<tr>
<td>C.6</td>
<td>Exact and approximated output stage’s transfer function</td>
<td>321</td>
</tr>
<tr>
<td>C.7</td>
<td>Numerical example for lower band UWB applications</td>
<td>323</td>
</tr>
<tr>
<td>C.8</td>
<td>Fractional bandwidth, gain &amp; ripple of the output stage</td>
<td>324</td>
</tr>
<tr>
<td>C.9</td>
<td>Simplified schematic of the output stage</td>
<td>326</td>
</tr>
<tr>
<td>C.10</td>
<td>Chip photograph of the IR-UWB output stage prototype</td>
<td>327</td>
</tr>
<tr>
<td>C.11</td>
<td>Measurement setup for mixed-mode S-parameters</td>
<td>329</td>
</tr>
<tr>
<td>C.12</td>
<td>Measured and simulated $S_{3d}$ transfer function</td>
<td>330</td>
</tr>
<tr>
<td>C.13</td>
<td>Measured and simulated $S_{3d}$ group delay</td>
<td>331</td>
</tr>
<tr>
<td>C.14</td>
<td>Measured and simulated imbalance of the balun function</td>
<td>332</td>
</tr>
<tr>
<td>C.15</td>
<td>Linearity of the output stage (HD)</td>
<td>333</td>
</tr>
<tr>
<td>C.16</td>
<td>Linearity of the output stage (IM)</td>
<td>334</td>
</tr>
<tr>
<td>C.17</td>
<td>Simulated input and output power spectral densities</td>
<td>335</td>
</tr>
<tr>
<td>D.1</td>
<td>Block schematic of the self-calibrated oscillator approach</td>
<td>338</td>
</tr>
<tr>
<td>D.2</td>
<td>Measured open loop oscillator frequency accuracy</td>
<td>340</td>
</tr>
<tr>
<td>D.3</td>
<td>Measured output spectrum of the open loop VCO</td>
<td>341</td>
</tr>
<tr>
<td>D.4</td>
<td>Settling time comparison</td>
<td>342</td>
</tr>
</tbody>
</table>
## List of Tables

2.1 ECC’s maximum equivalent isotropically radiated power
3.1 Path loss of the IEEE 802.15.4a channel model
3.2 Coherence bandwidths
3.3 Signal fluctuations in multipath channel
3.4 Transmitted signal parameters
3.5 Unresolved realizations due to IPI
3.6 Estimated link budget
3.7 Summary of the transceiver implementation losses
3.8 Principal potential interfering radio systems
4.1 State-of-the-art low data rate transmitter comparison
4.2 TX power consumption overview
5.1 Summary of the PLL specifications
5.2 Division modes for Tx and Rx configurations
5.3 Logic signal levels configuring the divider
6.1 LNA performances summary
6.2 VGA performances summary
7.1 Comparison of existing IR-UWB baseband systems
7.2 Receiver performance summary
B.1 Performances of the wavelet generator and comparison
C.1 Numerical investigation of worst-case process variations
D.1 Timings and power cons. for AD/DA freq. calibration
Abstract

Ultra-Wideband (UWB) is a promising technology for short-range and low-power indoor data communications. The recent interest in this technology was initiated in February 2002 in the United States. The amendment of the spectrum policies by the Federal Communications Commission (FCC) allowed the use of a radio signal occupying a bandwidth in excess of 500 MHz within the frequency range between 3.1 and 10.6 GHz. This opened the way to two main classes of wireless applications: 1) high-speed wireless aiming at transmission rates above 100 Mb/s for multimedia applications, and 2) low-complexity radio systems with a high integration level intended for low-power applications such as wireless sensor networks (WSN). This work focuses on the second class of applications. Ultra-Wideband has been recognized as an interesting candidate for small portable devices transmitting data at faster rates and lower power consumption than existing short-range wireless Standards such as Bluetooth or ZigBee. Following the United States, the Electronic Communications Committee (ECC) in Europe finalized a decision in February 2007, which clearly states that the 6-8.5 GHz frequency range is the preferred band for long-term operation of FCC-like UWB devices.

The goal of this project was the development of a standalone wireless integrated transceiver using the promising Impulse-Radio Ultra-Wideband (IR-UWB) technology. The benefits and the limitations of this technology were first thoroughly investigated. We specified a transceiver that could easily be implemented with the minimum loss of performance with respect to an optimum transceiver. Investigations showed that the main limitation comes from the characteristics of the indoor channel. The latter suffers from the multipath effect, which induces fading and inter-symbol interference. In low-complexity transceivers, where no extensive signal processing can be applied, it is preferable to exploit the multipath rather than to mitigate it. The principle of “energy-collection” is thus applied to the proposed receiver. The second principle, diversity, is based on the use of frequency multiple access (carrier-based IR-UWB). For the class of devices targeted in this work (single antenna), frequency diversity is also the only diversity strategy that enables a reliable communication link. If the wireless devices are stationary or seldom moving, time diversity is not a reliable option, while small size rules out spatial diversity (e.g. multiple antennas).
The main part of this thesis deals with the development of integrated circuits for an IR-UWB transceiver operating between 3 and 5 GHz. The integrated circuits were all realized using a commercial 0.18 µm CMOS technology. The heart of the transceiver is a wideband voltage-controlled oscillator (VCO) that can be used either in closed loop (PLL) or in free-running mode. The latter solution involves a self-calibration procedure using the PLL and a digital-to-analog conversion of the control voltage of the VCO. It yields an accuracy of ±20 MHz, which is sufficient to meet the requirements of our transceiver. Several other solutions have also been investigated to generate carrier-based IR-UWB signalling, such as a simple switched oscillator or a pulse shaping stage at radio-frequencies. Finally, a single-chip transmitter using BFSK modulation and offering three channels between 3.3 and 4.8 GHz is demonstrated.

The RF front-end receiver features a 3-5 GHz UWB LNA, which feeds quadrature mixers for frequency down-conversion. A variable gain amplifier has also been developed specifically for our application. It provides 60 dB of voltage gain over a bandwidth of 180 MHz. This circuit also features an automatic gain control that compensates for transmitter-receiver range variations and multipath fading effects. The back-end section consists of a mixed-mode application-specific integrated circuit (ASIC), whose function is to demodulate the down-converted quadrature signal into a bipolar pulsed signal. The resulting analog signal is integrated over a duration of 20-40 ns and converted into a digital signal by means of a 1-bit A/D conversion (comparator). The digital section of the baseband chip also synchronizes on the incoming pulse stream and provides the corresponding digital data. The baseband processor is also able to provide an estimation of the signal-to-noise ratio without any information from the received signal strength. This enables an assessment of the link quality prior to any further processing or data storage.

The entire IR-UWB link exhibits a communication range of 10 m at 5 Mb/s in free space without any error correction or other coding scheme. This corresponds to a receiver sensitivity of -83.7 dBm for a bit error rate of $10^{-2}$. The power consumption of the entire receiver in tracking mode (after synchronization) at 5 Mb/s reaches 36 mW.
Résumé

Ultra-Wideband (UWB) is a promising technology for short-range wireless communications inside buildings. The recent interest in this technology was initiated in the United States in February 2002. The amendment of the regulation on electromagnetic emissions allowed the use of wideband radio signals (> 500 MHz) over a frequency range extending from 3.1 to 10.6 GHz. This amendment gave rise to two new types of wireless applications: 1) applications targeting transfer rates above 100 Mb/s (multimedia), and 2) reduced-complexity radio systems that use high levels of integration and target low-power sensor network applications (wireless sensor networks, WSN). This work concentrates essentially on this second class of applications; UWB has been identified as a technology with strong potential for small portable devices, delivering transfer rates higher than Bluetooth or ZigBee while maintaining lower levels of energy consumption. In February 2007, Europe, through the Electronic Communications Committee (ECC), followed the United States by adopting a decision concerning the use of UWB in the Community. This decision clearly establishes that the frequency band favored in the long term for FCC-compatible applications shall lie between 6 and 8.5 GHz.

The goal of this project was the development of a standalone, integrated wireless transceiver based on impulse-radio UWB (IR-UWB) technology. The advantages and drawbacks of this technology were first thoroughly studied. The work concentrated on elaborating a specification that eases integration and results in a minimum of losses with respect to a theoretical optimum solution. The studies showed that the main limitation is related to the characteristics of the transmission channel. The latter is indeed affected by strong distortions arising from the multiple propagation paths (multipath) and from reflections of the transmitted wave inside buildings. In reduced-complexity systems, no sophisticated signal processing is envisaged. Consequently, it is preferable to work with the channel imperfections rather than to fight them. The principle of “energy collection” is therefore applied at the receiver, the idea being to accumulate the energy received from the different propagation paths by integrating the signal arriving at the receiver antenna. The second principle, diversity, applied in this receiver is based on the use of frequency multiplexing. For wireless devices using a single antenna, frequency diversity is the only strategy that improves the reliability of the communication. Indeed, for low-mobility applications, time diversity is not feasible, while the size of the receiver excludes spatial diversity (multiple antennas).

The main task was the development of a series of integrated circuits for a UWB wireless application between 3 and 5 GHz, based on a 0.18 µm CMOS technology. The heart of the transceiver is a wideband voltage-controlled oscillator (VCO). The latter can be used in closed loop (with a phase-locked loop, PLL) or in free-running mode. This latter solution involves a calibration based on a PLL and a digital-to-analog conversion of the VCO control voltage. This method allows a drastic reduction of the power consumption while maintaining a sufficient frequency accuracy ($<\pm20$ MHz). Several other solutions were studied for the generation of UWB impulse signals (fast start-up oscillator, filtering output stage). Finally, a fully integrated frequency-modulation transmitter offering three channels was demonstrated.

The radio-frequency receiver consists of a low-noise amplifier (LNA) feeding a pair of quadrature mixers. A variable gain amplifier was also developed for the baseband. It provides a voltage gain of 60 dB over a bandwidth of 180 MHz. This circuit also features an automatic gain control that compensates for signal variations. The final signal processing is performed by an application-specific integrated circuit (ASIC). Its function is to demodulate, detect and convert the analog signal into a digital signal (data and synchronization). In addition, a low-complexity processor is able to provide an estimate of the signal-to-noise ratio without latency and without a priori knowledge of the received signal amplitude. This enables an estimation of the link quality before additional processing or data storage.

The resulting link enables communication over a distance of 10 m at a transfer rate of 5 Mb/s in free space without any error correction or coding. This corresponds to a receiver sensitivity of -83.7 dBm (error probability of $10^{-2}$). The total power consumption of the receiver is 36 mW.
1

Introduction

1.1. Motivation and Background

Impulse-Radio Ultra-Wideband (IR-UWB) is not a new wireless technology, it’s... the oldest. The first experiments using short electromagnetic pulses date back to the end of the 19th century, when Hughes, Hertz and Marconi managed to transmit and receive this kind of signal over the air with a spark-gap generator and tuned detector. During the last 30 years, pulse-based UWB was rather pushed ahead for military applications (radars). Since the beginning of the millennium, UWB has been revitalized for wireless communications over short distances with the adoption of new spectrum policies.

UWB mostly gained interest owing to the huge bandwidth and the strong potential for very high speed wireless communications. However, the available bandwidth also makes UWB attractive owing to some other properties: low complexity, low power and robustness. These are central motivations for this work. Low complexity and low power are somewhat related. Wide available bandwidths make possible the use of “ill-defined” spectral characteristics and frequencies, which do not require excessive calibration and accuracy at the transmitter and the receiver. The idea is to minimize the time needed to calibrate the transceiver and consequently to reduce the overall power consumption. On the other hand, large bandwidths may preclude the use of energy-efficient resonant circuits and increase the current consumption needed to reach sufficient gain.

Wide bandwidths also provide robustness for communication links, especially in environments with dense echoes and interference. In small devices using a single antenna, frequency is actually the only source of diversity. Multipath fading can be mitigated in communication links using wideband signals, and the selection of an appropriate frequency band is a way to avoid interference. The main limitation of UWB communication is the presence of in-band interferers that can easily saturate the UWB receiver front-end.

**Figure 1.1.** UWB applications

In this work, we investigate an integrated IR-UWB solution for portable wireless devices using moderate bitrate in the order of 1-10 Mb/s, over a range of 10 m. The small form factor targeted in this work restricts the use of multiple antennas, bulky batteries and external components. As shown in Fig. 1.1, these targets fit the needs for a class of communicating devices that may fill the gap existing between Bluetooth, WLAN and high speed multimedia applications.

1.2. Ultra-Wideband: One Definition?

Ultra-Wideband (UWB) is a generic term that does not exactly describe the type of signals or modulation used. Historically, UWB has referred to radio devices employing signals having a -10 dB fractional bandwidth greater than 25%, or an absolute bandwidth greater than 1.5 GHz. In Chapter 2, a revised definition adopted in 2002 will be given. From the point of view of UWB signals, one can distinguish between two main classes:

- **Impulse-radio UWB (IR-UWB):** this term was traditionally used to define radio devices using short, low duty-cycle, baseband-generated electrical impulses. These systems do not use any carrier to up-convert the signal towards radio-frequencies (RF) and are also called “carrierless” systems. The difficulty lies in the control of the exact shape of the impulse and its center frequency. However, these solutions offer the best potential for ultra low-power radio devices;

- **Carrier-based UWB (CB-UWB):** this solution is closer to the well-known heterodyne or homodyne radio systems, in which one or several carriers are modulated. These systems use frequency up- and down-conversion and possess a better spectral accuracy, which makes the compliance of the generated signal with regulations much easier (see Sections 2.2 and 2.3).

In this project, a hybrid solution that can be seen as a “carrier-based impulse-radio UWB” is proposed. The motivation was to take the best of both worlds, that is, the low-power advantage of IR-UWB and the benefits of frequency-agile CB-UWB transceivers.

1.3. KTI Research Project

This project was supported by the Federal Office for Professional Education and Technology (OPET), contract/grant number KTS-6322.1 [1], and officially started in April 2003.

The basic goal of this work is the development of a radio-frequency integrated circuit (RFIC) for moderate data rates using radio signals that meet the UWB definition. More particularly, this project focuses on the development of a fully integrated front-end using commercial Complementary Metal Oxide Semiconductor (CMOS) technology for compact and portable communication devices with low power consumption. The targeted performance is a power consumption on the order of 10 mW for 1 Mb/s of raw data rate. The minimal range with uncoded transmission schemes (worst case in a strong multipath indoor environment) has been set to 3 meters. The potential applications of this UWB transceiver are seen in small portable devices that require wireless connectivity over a short range such as, for example, sensors, wrist-watches with multimedia capability (diary, MP3,...) or any other devices forming or belonging to a Wireless Personal Area Network (WPAN). An overview of the main objectives is given hereafter:

1. moderate bit rate (1-10 Mb/s);
2. low power consumption (< 50 mW at maximum data rate);
3. robustness and reliability in dense multipath environment;
4. robustness against strong narrowband interferences (especially WLAN and WPAN such as Bluetooth);
5. extremely small form factor (battery-powered and single antenna device).
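The power and bit-rate targets above can be restated as an energy per bit, which makes them easier to compare. The following is a minimal sketch and not part of the original project specification; the function name `energy_per_bit_nj` is hypothetical, and the second call assumes that the 50 mW bound of objective 2 applies at the 10 Mb/s upper end of objective 1.

```python
def energy_per_bit_nj(power_mw, bit_rate_mbps):
    """Energy per bit in nJ: power in mW divided by bit rate in Mb/s."""
    if bit_rate_mbps <= 0:
        raise ValueError("bit rate must be positive")
    return power_mw / bit_rate_mbps


if __name__ == "__main__":
    # Stated target: about 10 mW for 1 Mb/s of raw data rate.
    print(energy_per_bit_nj(10.0, 1.0))   # 10.0 nJ/bit
    # Assumed reading of objectives 1 and 2: 50 mW bound at 10 Mb/s.
    print(energy_per_bit_nj(50.0, 10.0))  # 5.0 nJ/bit
```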

1.4. Content and Organization of the Thesis

This thesis describes the design and the implementation of several integrated circuits for carrier-based IR-UWB wireless communication. The main goal was to implement and characterize an entire communication link, targeting low power and low complexity, but providing sufficient reliability to operate as a WPAN device.

In Chapter 2, an overview of UWB in terms of worldwide regulation and adopted Standards is provided. UWB systems operate over very large bandwidths and necessitate restrictive rules to allow wireless systems to co-exist with each other. This chapter focuses mainly on the description of the first adopted regulation in the USA and on the amendment of the spectrum policy in Europe. A historical review of the development of a Standard for UWB by the IEEE is also given.

Transceiver planning is the topic of Chapter 3. The choice of a signalling scheme (modulation) and its performance in a dense multipath environment is investigated. This Chapter also concentrates on the choice of a demodulation technique at the receiver side. The demodulation should enable a sufficient bit error rate (BER) of typically $10^{-2}$ over a range of 10 m at a maximum data rate of 10 Mb/s.

Chapter 4 describes a fully integrated CMOS IR-UWB transmitter. This device is based on direct modulation and relies on a RF carrier, which can be generated by a PLL (Chapter 5) or a free-running voltage controlled oscillator (VCO), which is periodically calibrated. The calibration procedure makes use of the proposed PLL and is based on the digitization of the control voltage of the VCO (Appendix D).

The IR-UWB receiver front-end is described in Chapter 6 and is based on direct down-conversion. A low-noise amplifier, a quadrature mixer and a variable gain amplifier are proposed. The demodulator and a digital baseband back-end circuit are given in Chapter 7. A synchronization algorithm has been implemented in the baseband to enable a stand-alone operation of the entire receiver. This work is concluded and summarized in Chapter 8.

1.5. Definitions: Impulse, Pulse and Burst

A definition of the most used expressions that define signals involved in UWB radio techniques is given hereafter:

**Impulse**: a surge of unidirectional polarity that is often used to excite a UWB band limiting filter whose output, when radiated, is a UWB pulse.

**Pulse**: a radiated short transient signal whose time duration is nominally the reciprocal of its UWB -10 dB bandwidth.

**Burst**: an emitted signal whose time duration is not related to its bandwidth.

Note that the denomination “Impulse-Radio Ultra-Wideband” (IR-UWB) does not restrict itself to the use of impulse signals, but rather includes the three aforementioned definitions. IR-UWB is a generic term including wideband radio techniques that are not based on the modulation or the transmission of a continuous wave (CW).

IR-UWB more generally encompasses signalling schemes using very small duty-cycles applied on one bit or symbol (impulses or pulses) or on several consecutive bits or symbols (bursts).
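The relation between bandwidth, pulse duration and duty cycle implied by these definitions can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, one pulse per bit is assumed, and the example values (a 500 MHz bandwidth and a 5 Mb/s rate) are chosen for convenience rather than taken from a specific design in this work.

```python
def nominal_pulse_duration_s(bandwidth_hz):
    """Nominal pulse duration: reciprocal of the -10 dB bandwidth."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return 1.0 / bandwidth_hz


def duty_cycle(bandwidth_hz, pulse_rate_hz):
    """Fraction of time occupied by pulses (assumes one pulse per period)."""
    return nominal_pulse_duration_s(bandwidth_hz) * pulse_rate_hz


if __name__ == "__main__":
    # Assumed example: 500 MHz bandwidth gives a pulse of about 2 ns.
    print(nominal_pulse_duration_s(500e6))
    # Assumed example: one such pulse per bit at 5 Mb/s, duty cycle near 1 %.
    print(duty_cycle(500e6, 5e6))
```

Under these assumptions the transmitter is active for roughly one hundredth of the time, which is the sense in which IR-UWB signalling uses very small duty cycles.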
2.1. UWB History

The dominant method for wireless communications has always been the emission and the modulation of sinusoidal CW. The interest in carrier-less signalling began in the 1960s, when time-domain electromagnetics was used to fully describe the transient behavior of a certain class of microwave networks through their characteristic impulse response. This section summarizes some of the important milestones that have enabled the development of the UWB technology. For a detailed historical review, the interested reader is referred to [2]:

- 1960s - Impulse-based signalling schemes investigated in the U.S. by H. F. Harmuth at Catholic University of America, G. F. Ross and K. W. Robbins at Sperry Rand Corporation, P. van Etten at the USAF’s Rome Air Development Center, and in the former Soviet Union by L. Y. Astanin, mainly for radar applications;

- 1973 - G. F. Ross and K. W. Robbins’ US Patent 3,728,632 [5] is a landmark patent in UWB communications. This patent recognized the utility in spread spectrum systems of a wide instantaneous bandwidth (as opposed to sequential bandwidth);

- 1974 - First ground penetrating radar (GPR) shown by Morey, which was to become a commercial success at Geophysical Survey Systems, Inc. (GSSI);

- Until 1989 - Several publications under different designations such as impulse, carrier-free, baseband, time domain, nonsinusoidal, orthogonal function and large-relative-bandwidth radio and radar signals. The expression UWB appears in 1989 and is introduced by the Department of Defense (DoD) of the United States;

- 1998 - The telecommunication revolution increased the need for data rate for commercial applications. UWB is identified as a potential candidate to deliver very high data rates in a wireless manner. Many companies encouraged the U.S. government to take steps to regulate the intended emissions of such signals. Until 2002, UWB was regulated as unintended emissions under the FCC Part 15.209 [6].

2.2. Regulations in the USA: FCC Amendment

In February 2002, the Federal Communications Commission (FCC) of the United States of America approved the use of UWB technologies for commercial applications [7]. A “First Report and Order” [8] was released in April 2002 to permit the operation and the marketing of certain types of new radio products using the UWB technology.

According to the FCC, UWB signals shall have a -10 dB bandwidth either larger than 20% of the center frequency or larger than 500 MHz, whichever applies. No licenses are required for UWB wireless devices. For UWB communication devices, the frequency band between 3.1 GHz and 10.6 GHz is allocated. Since UWB will operate using spectrum occupied by existing radio services, the specifications contained in the FCC ruling are rather conservative, especially in terms of radiated power spectral density (PSD), which has been limited to 75 nW/MHz (or -41.3 dBm/MHz) in the allocated band. This low limit should avoid harmful interference with incumbent radio services. The 2009 revision of the legal text can be found in [9].
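The bandwidth criterion and the PSD conversion quoted above can be checked with a few lines of code. This is a minimal sketch, not taken from the FCC text: the function names are hypothetical, and the center frequency is assumed to be the arithmetic mean of the -10 dB band edges.

```python
import math


def fractional_bandwidth(f_low_hz, f_high_hz):
    """-10 dB bandwidth over center frequency (assumed mean of band edges)."""
    center_hz = 0.5 * (f_low_hz + f_high_hz)
    return (f_high_hz - f_low_hz) / center_hz


def is_uwb_fcc(f_low_hz, f_high_hz):
    """True if bandwidth exceeds 20 % of the center frequency or 500 MHz."""
    bandwidth_hz = f_high_hz - f_low_hz
    return fractional_bandwidth(f_low_hz, f_high_hz) > 0.20 or bandwidth_hz > 500e6


def psd_nw_per_mhz_to_dbm_per_mhz(psd_nw_per_mhz):
    """Convert a PSD from nW/MHz to dBm/MHz (1 mW = 1e6 nW)."""
    return 10.0 * math.log10(psd_nw_per_mhz * 1e-6)


if __name__ == "__main__":
    # 75 nW/MHz is about -41.25 dBm/MHz, usually quoted as -41.3 dBm/MHz.
    print(round(psd_nw_per_mhz_to_dbm_per_mhz(75.0), 2))
    print(is_uwb_fcc(3.3e9, 3.9e9))  # 600 MHz wide: meets the 500 MHz criterion
    print(is_uwb_fcc(3.5e9, 3.7e9))  # 200 MHz wide, about 5.6 %: meets neither
```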

2.3. Activities and Regulation in Europe

Contrary to the FCC, which rapidly released an underlay frequency band for UWB without undertaking extensive studies, Europe hesitated several years before amending the regulation. This slower process was mainly caused by a difference in structure and in philosophy. The FCC can be seen as a single regulatory body having two tasks: 1) protecting incumbent services from interferers and 2) promoting innovations to make more valuable use of the spectral resource. In Europe, a large number of countries exist in close proximity and the regulatory body does not have the same degree of freedom as in the U.S. The European model is based on consensus and mutual agreement.

2.3.1. Overview of European Regulatory Bodies

Before describing the development of UWB in Europe, it is worth depicting the somewhat intricate relationships and functions of the different entities that have been involved in the decisions of the European Union (EU). The main organizations are enumerated hereafter; these are also depicted in Fig. 2.1 with other international regulation and standardization bodies such as ITU-R, Ecma and IEEE, which will be presented in Sections 2.4, 2.6.3 and 2.6.1, respectively:

- EC: European Commission, which is the executive body of the EU and mandates ETSI and CEPT/ECC;

- RSC: Radio Spectrum Committee, whose members are the national administrations. This body makes sure that the policies of the EC effectively represent the interests of the member states;

- CEPT [10]: European Conference of Postal and Telecommunications Administrations is a supervising body that has been created by international treaty for the purpose of harmonizing work about spectrum regulation in telecommunications among the member countries of the EU;

- ECC: Electronic Communications Committee, which can be seen as the engineering arm of the CEPT, is responsible for technical aspects of the regulation, such as spectrum allocation, licensing and compliance requirements;

- ETSI: European Telecommunications Standards Institute, which is usually in charge of defining Standards for the EU.
2.3.2. The First Mandate

The first phase of the UWB study has been performed within working groups associated with ETSI. In parallel, workshops and study groups have been organized by CEPT. These groups have evaluated how to accommodate UWB-based radio services within the frequency range of 1-40 GHz.

In 2004, the CEPT received a mandate [11] from the EC through the RSC to undertake all the necessary work to identify the most appropriate criteria for the introduction of UWB applications in the EU. Within the CEPT, the ECC established a Task Group (TG3) to develop the responses and complete the technical studies on the development of UWB in Europe. This work applies to generic radio devices below 10.6 GHz that are exempt from individual licensing and operate on a non-interference, non-protected basis. This means that no harmful interference may be caused to any other radiocommunication service and that no claim may be made for protection of these UWB devices against harmful interference originating from other radiocommunication services.

More particularly, this Task Group primarily targeted applications for data transmission, such as high data rate WPAN (HDR: 50 to 500 Mbit/s) and very high data rate WPAN (VHDR: above 500 Mbit/s), mainly deployed in indoor environments, and low data rate applications (LDR: typically below 1 Mbit/s) with localization and tracking abilities.
2.3.3. The Extremely Conservative ECC Report 64

In February 2005, ECC issued a technical report - ECC Report 64 [12] - that has considered the protection requirements of radiocommunications services below 10.6 GHz from generic UWB applications. The study, based mostly on theoretical analysis, gives conclusions on available data concerning UWB technical characteristics and propagation models. The conclusions have been considered without specific mitigation techniques applied on UWB devices to reduce the amount of interference on existing services. It should be noted that not all frequency bands which are allocated to the radiocommunications services within the frequency band of interest have been considered in this report.

Based on the deployment scenarios and protection distances assumed in the studies of the ECC Report 64, the majority of the radiocommunication services considered require up to 20-30 dB more stringent generic UWB PSD limits than the FCC density limits adopted in the United States earlier in 2002. If the concerned radiocommunication service is operated in an outdoor environment, as is the case for example in Fixed Services (FS), Fixed Satellite Services (FSS), Radio Astronomy Services (RAS) and Earth Exploration Satellite Services (EESS), then the increase of noise due to the aggregate UWB interference generally determines the generic UWB PSD limit. The most affected bands are used by RAS (a few MHz around 3.3 GHz and especially a 200 MHz band around 4.9 GHz), which require up to 50-80 dB more stringent limits. If the affected radiocommunication service is operated in the indoor environment, e.g. Digital Audio and Video Broadcasting (DAB & DVB), Mobile Services, Radio LAN etc., then the closest UWB interferer becomes the dominant interference factor due to small spatial separation (small path loss).

It was recognized that regulatory solutions based on the maximum generic UWB PSD limits calculated in ECC Report 64, while protecting existing services with a high degree of confidence, would not facilitate the deployment of UWB operations in Europe. Moreover, it was concluded that this report led to requirements which are too stringent to allow feasible operation of UWB in Europe! Based on this report, an answer to the first mandate was issued [13].

2.3.4. The 2nd and 3rd Mandates: Mitigation Techniques

Further analyses have been performed within the frame of a second mandate issued by the European Commission to CEPT in June 2005 to investigate a less conservative and less theoretically-based recommendation.

The mandate included in particular an impact analysis considering a uniform PSD limit of -55 dBm/MHz in the 3.1-10.6 GHz frequency range. It also considered the implementation of mitigation scenarios to further reduce or cancel the potential interference of UWB devices with incumbent services (such as radars between 2.7 and 3.4 GHz). These mitigation techniques typically are:

1. low duty-cycle (LDC) operations;
2. detect-and-avoid (DAA) mechanisms and
3. restriction to indoor UWB applications.

For instance, the consequence of the LDC restriction on the choice of a solution for communicating devices using UWB is discussed in Section 2.3.7. In conclusion, the second report nevertheless relaxed the rather negative outlook implied by ECC Report 64.

The TG3 recommendations that came out of the second mandate included several provisions for which analysis had not been conducted, such as the use of UWB on road or rail vehicles, and in cars and aeroplanes. The third mandate’s goal was to make further investigations on these aspects.

2.3.5. The European Decision: ECC/DEC/(06)04

In March 2006, CEPT reached an important milestone by issuing an ECC Decision [15]. This Decision is mainly based on ECC Report 64 and the second CEPT report. This decision recognizes that the UWB technology holds potential for a wide variety of new Short Range Devices (SRD) for communications, measurements, location tracking, imaging, surveillance and medical systems.

The main technical requirements for devices using the UWB technology in bands below 10.6 GHz are summarized hereafter:

- ECC has decided to allow UWB activity on the upper band (6 to 8.5 GHz) with an FCC-compatible PSD level of -41.3 dBm/MHz. No particular restriction has been set on this band (either DAA or LDC);
- the lower bands below 5 GHz (i.e. the bands containing the ones chosen for this work) are limited by a maximum mean equivalent isotropically radiated power (EIRP) of -80 dBm/MHz between 3.4 and 3.8 GHz, and -70 dBm/MHz between 3.1-3.4 GHz and 3.8-4.8 GHz;
- however, ECC has considered a separate Decision covering the frequency band 3.1-4.8 GHz, within which LDC operations are permitted with a PSD limit of -41.3 dBm/MHz (see next Section); the band 4.2-4.8 GHz was available without any restriction only until the end of 2010;
- the devices permitted under this ECC Decision are exempt from individual licensing and operate on a non-interference, non-protected basis;
- devices covered by the scope of this ECC Decision are not allowed to be used at a fixed outdoor location or connected to a fixed outdoor antenna.

At the end of 2006, the European Community published a draft document which led, in February 2007, to a final decision “on allowing the use of the radio spectrum for equipment using ultra-wideband technology in a harmonized manner in the Community” [16]. The most important differences from the FCC rules lie in the minimum bandwidth, which is fixed at 50 MHz, and in a much smaller available bandwidth. This document confirmed that the frequency range from 6.0 to 8.5 GHz is the preferred band for long-term operation of FCC-like UWB radio devices, i.e. a maximum of -41.3 dBm/MHz mean and 0 dBm/50 MHz peak EIRP. Within this band, no interference mitigation technique is required.

2.3.6. The Amendment to the Final Decision

In October 2008, ECC issued an amendment to the final decision [17] to allow operation of UWB devices at a maximum mean EIRP level of -41.3 dBm/MHz in the bands 3.1-4.8 GHz and 8.5-9 GHz. Within these two bands, ECC investigated DAA and LDC mitigation techniques in order to ensure protection of Broadband Wireless Access (BWA, such as IEEE 802.16 WiMAX) and applications in radio-location services.

The decision regarding the spectrum mask is summarized in Table 2.1 and depicted for comparison with the US FCC regulation in Fig. 2.2.

**Table 2.1.** Maximum equivalent isotropically radiated power (EIRP) limits fixed in the European Community for indoor UWB-RT (as of October 2008, according to ECC/DEC/(06)04 and amendment ECC/DEC/(06)12).

<table>
<thead>
<tr>
<th>Frequency Range [GHz]</th>
<th>Max. mean EIRP [dBm/MHz]</th>
<th>Max. peak EIRP [dBm/50 MHz]</th>
</tr>
</thead>
<tbody>
<tr>
<td>below 1.6</td>
<td>-90</td>
<td>-50</td>
</tr>
<tr>
<td>1.6 - 2.7</td>
<td>-85</td>
<td>-45</td>
</tr>
<tr>
<td>2.7 - 3.4</td>
<td>-70</td>
<td>-36</td>
</tr>
<tr>
<td>3.4 - 3.8</td>
<td>-80</td>
<td>-40</td>
</tr>
<tr>
<td>3.8 - 4.2</td>
<td>-70</td>
<td>-40</td>
</tr>
<tr>
<td>4.2 - 4.8</td>
<td>-70/-41.3*</td>
<td>-30/0*</td>
</tr>
<tr>
<td></td>
<td></td>
<td>* for UWB devices placed on the market before 31st Dec. 2010</td>
</tr>
<tr>
<td>3.1 - 4.8</td>
<td>-41.3</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>for LDC or DAA UWB devices</td>
</tr>
<tr>
<td>4.8 - 6.0</td>
<td>-70</td>
<td>-30</td>
</tr>
<tr>
<td>6.0 - 8.5</td>
<td>-41.3</td>
<td>0</td>
</tr>
<tr>
<td>8.5 - 9.0</td>
<td>-41.3</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>for DAA UWB devices</td>
</tr>
<tr>
<td>8.5 - 10.6</td>
<td>-65</td>
<td>-25</td>
</tr>
<tr>
<td>above 10.6</td>
<td>-85</td>
<td>-45</td>
</tr>
</tbody>
</table>

### 2.3.7. Some Considerations on Mitigation Techniques

Before the advent of UWB, the peaceful coexistence of wireless systems was mostly based on frequency multiplexing and, to a lesser extent, spatial separation. UWB is referred to as a “spectrum underlay” technology and therefore no longer follows these schemes. Although it was recognized that the frequency range below 4.8 GHz has advantages owing to the lower path loss and the near-term availability of the technology, the European regulatory body considered that additional requirements in terms of protection of other services were needed below 4.8 GHz. ECC therefore introduced two new concepts in the regulations, the LDC and DAA mechanisms. This was of particular concern for our solution, since this band is contained in the band chosen for this project.

**Detect-and-Avoid (DAA):** To use spectrum below 4.8 GHz, a UWB device must listen for operational services using the channel. If a signal is detected, the UWB device must either modify its emission to avoid interfering or abandon the spectrum. The DAA mechanism protects BWA applications in the band 3.4-4.2 GHz and radio-location applications within the bands 3.1-3.4 GHz and 8.5-9 GHz (ECC Report 120 [18]). This mechanism has not been investigated in this work, since it appeared during the development of the integrated circuit. It later emerged in the literature that DAA is difficult to implement. The first challenge is to distinguish incumbent transmissions from normal environmental noise. The second is that the detection mechanism needs to be fairly specific to the service to be detected.

**Low Duty-cycle (LDC):** A device employing LDC limits itself to being on the air for a limited period of time. LDC does not require any monitoring of the radio environment to decide whether a UWB signal can be transmitted at the maximum PSD. LDC was introduced to further protect WiMAX BWA terminals from low-complexity LDR UWB devices operating in the band 3.4-4.2 GHz. The technical requirements in terms of LDC for devices allowed to operate at a level of up to -41.3 dBm/MHz were first defined in Annex I of ref. [19] and upheld in ECC Report 94 [20]. The set of requirements for this kind of UWB device is enumerated hereafter:

- $T_{\text{on, max}} = 5 \text{ ms}$; maximum duration of a continuous train of pulses (burst), irrespective of the number of pulses in the burst;

- $T_{\text{off, mean}} \geq 38 \text{ ms}$; average time interval between bursts (UWB emission idle);

- $\Sigma T_{\text{off}} > 950 \text{ ms per second}$;

- $\Sigma T_{\text{on}} < 50 \text{ ms}$ per second (5 %/s) and $\Sigma T_{\text{on}} < 18 \text{ s}$ per hour (0.5 %/h) (“double limitation”).

Assuming a system emitting at a pulse repetition rate (PRR) of 10 MHz, the 5 % limit on $\Sigma T_{\text{on}}$ restricts the maximum pulse throughput to 500 kp/s (i.e. half a Mb/s for an uncoded transmission with one bit per pulse). From the data rate point of view, this kind of device may be able to compete with existing LDR standards such as Bluetooth or ZigBee if and only if the LDC requirements are accompanied by very efficient power reduction schemes. This constraint on efficient powering schemes fits the characteristics of the impulse-radio UWB (IR-UWB) technology well, since it can be “duty cycled” by nature.
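The LDC arithmetic above can be checked with a short sketch. It is only an illustration under stated assumptions: the function names and the notion of a strictly periodic burst schedule are hypothetical, and only the four numerical limits are taken from the list above.

```python
# LDC limits as enumerated in the text above.
T_ON_MAX_S = 5e-3            # maximum duration of a single burst
T_OFF_MEAN_MIN_S = 38e-3     # minimum mean idle time between bursts
MAX_ON_PER_SECOND_S = 50e-3  # 5 % per second
MAX_ON_PER_HOUR_S = 18.0     # 0.5 % per hour


def max_pulse_throughput(prr_hz, on_time_per_second_s=MAX_ON_PER_SECOND_S):
    """Upper bound on pulses per second for a given pulse repetition rate."""
    return prr_hz * on_time_per_second_s


def ldc_compliant(burst_s, mean_idle_s, bursts_per_second,
                  active_seconds_per_hour=3600):
    """Check a hypothetical periodic burst schedule against the LDC limits.

    mean_idle_s is supplied by the caller (assumption: it is not derived
    from the other arguments). active_seconds_per_hour is the number of
    seconds per hour during which the schedule is actually running.
    """
    on_per_second = burst_s * bursts_per_second
    on_per_hour = on_per_second * active_seconds_per_hour
    return (
        burst_s <= T_ON_MAX_S
        and mean_idle_s >= T_OFF_MEAN_MIN_S
        and on_per_second < MAX_ON_PER_SECOND_S
        and on_per_hour < MAX_ON_PER_HOUR_S
    )


if __name__ == "__main__":
    print(max_pulse_throughput(10e6))       # pulses per second at PRR = 10 MHz
    print(ldc_compliant(5e-3, 100e-3, 9))   # per-second limit met, hourly limit not
```

The second call illustrates the “double limitation”: a schedule that stays below 50 ms of emission per second can still exceed 18 s per hour if it runs continuously.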

![Figure 2.2. UWB spectrum masks of the FCC and of the European Community.](image)

**Figure 2.2.** UWB spectrum masks adopted by the Federal Communications Commission (FCC) in the United States (solid) and by the European Community under the Decisions ECC/DEC/(06)04 and ECC/DEC/(06)12 (dotted and dash-dotted, respectively) for intentional UWB emissions. The limits shown correspond to the maximum mean equivalent isotropically radiated power (e.i.r.p.) density in dBm/MHz. The gray shaded area represents the frequency band chosen for this work; intentional emissions cover the frequency band between 3.1 and 5 GHz, whereas unintentional out-of-band emissions are governed by the ECC regulation mask.

2.4. ITU-R Task Group 1/8

The use of the radio spectrum is regulated domestically by the spectrum management authority of each country (e.g., FCC in the U.S.) or group of countries (e.g., ETSI in Europe). To keep a certain homogeneity between countries or entities, the Radiocommunication Sector of the International Telecommunication Union (ITU-R) publishes recommendations to provide guidance to administrations. For that purpose, the ITU-R Task Group 1/8 was created in January 2003 to address the various regulatory and technical issues raised by UWB [21,22]. The main concerns of TG 1/8 have been divided into four main topics investigated by Working Groups (WG):

- **WG1: UWB characteristics (ITU-R 227/1 and ITU-R 226/1).** This document includes key technical and operational characteristics of UWB systems. Terminology and definitions for UWB are also investigated, as well as the applicability of old definitions to UWB.

- **WG2: UWB compatibility (ITU-R 227/1).** This document includes methodologies to compute the effect of emissions from single and multiple UWB systems on other radiocommunication systems.

- **WG3: spectrum management recommendations (ITU-R 227/1).** This framework is intended as guidance to administrations considering the introduction of UWB systems.

- **WG4: recommendation on measurement techniques of UWB emissions (Decision 1/95).** UWB devices use pulsed emissions at very low PSDs to deliver large amounts of data across several GHz of bandwidth. Appropriate measurement methods are thus needed for root mean square (RMS) and peak PSD. They are required for compliance with relevant standards and specifications and are useful for the purpose of spectrum monitoring.

**2.5. This Work vs. the Regulations**

The spectrum mask originally chosen for this work was a narrower band located at the lower end of the FCC mask. It was agreed with the industrial partner to use the frequency range below the 5 GHz WLAN ISM band, which corresponds to a frequency band extending from 3.1 to 5 GHz (see gray shaded area in Fig. 2.2). This would ease the design of a low-power device and enable the use of commercial IC technology.

After the amendment of the European regulation for UWB wireless communication, we also aimed to demonstrate the compatibility of our solution with the European rules. For intentional emissions, we kept the limit of -41.3 dBm/MHz between 3.1 and 5 GHz, whereas for the out-of-band spurious emissions, the values given in Table 2.1 have been adopted. The corresponding levels for spurious emissions are indicated in Fig. 2.2 by vanishing gray shaded areas on either side of the 3.1-5 GHz band.
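To give a feeling for what a mean EIRP density of -41.3 dBm/MHz means in terms of total radiated power, the following sketch integrates the limit over a band. It assumes a perfectly flat spectrum that fills the mask over the whole band, which is an idealization and therefore only an upper bound; the function names are illustrative.

```python
import math


def total_eirp_dbm(psd_dbm_per_mhz, f_low_ghz, f_high_ghz):
    """Total mean EIRP of a flat spectrum at the given density (upper bound)."""
    bandwidth_mhz = (f_high_ghz - f_low_ghz) * 1e3
    return psd_dbm_per_mhz + 10.0 * math.log10(bandwidth_mhz)


def dbm_to_microwatt(power_dbm):
    """Convert a power level in dBm to microwatts."""
    return 10.0 ** (power_dbm / 10.0) * 1e3


if __name__ == "__main__":
    band = total_eirp_dbm(-41.3, 3.1, 5.0)    # band chosen for this work
    full = total_eirp_dbm(-41.3, 3.1, 10.6)   # complete FCC UWB band
    print(f"3.1-5 GHz:    {band:.1f} dBm ({dbm_to_microwatt(band):.0f} uW)")
    print(f"3.1-10.6 GHz: {full:.1f} dBm ({dbm_to_microwatt(full):.0f} uW)")
```

Under this assumption, the 1.9 GHz wide band used in this work allows at most about -8.5 dBm of mean radiated power, i.e. well below one milliwatt.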

Since most of the work focuses on the elaboration of a radio-frequency front-end, whose topology depends only on the chosen frequency bands, the first difficulty lay in the choice of a signalling scheme based on the very restrictive and conservative FCC regulations. It must be emphasized here that no UWB standard was available at the beginning of the thesis and that some work was undertaken to devise a simple “proprietary” standard that would enable the realization of an IR-UWB transceiver. To make this choice easier, it was agreed with the industrial partner that the transceiver should have a low complexity. Especially on the receiver side, this work focuses on simple techniques that can nevertheless operate in dense multipath channels. This decision has prevented the use of a more bandwidth-efficient system, which would rely on a baseband part using extensive digital signal processing (such as equalization or Fourier transforms). Such topics are however not the main concern of this thesis.

Initially, one of the most constraining restrictions was the proposed measurement procedure for the emitted spectrum, especially regarding swept frequency (FMCW), stepped frequency or frequency hopping spread spectrum (FHSS) systems. The procedure required that measurements of such devices be made with the frequency sweep, step, or hop stopped [23]. This effectively precluded the use of these techniques to extend the bandwidth up to the UWB definition, even for a low data rate system. This point was the subject of heated discussions between promoters of multiband frequency-based solutions and defenders of pulse-based UWB. Since then, many applications strictly based on frequency modulation have been reported, such as FM-UWB [24], proposed by the Centre Suisse d’Électronique et de Microtechnique (CSEM), Neuchâtel, Switzerland.

The second issue at the initiation of this work was the lack of regulation in Europe. It was strongly believed at the time that Europe would not exactly follow the path taken by the US government and, thus, that some flexibility should be foreseen in the overall system. Some efforts have therefore been made during the development to design the overall system for maximum compatibility with the likely non-FCC-compatible ECC rules, the main motivation being the demonstration of a system essentially satisfying both the FCC and ECC rules in terms of frequency bands.

2.6. The Standardization Processes

A year after the FCC first adopted UWB rules, the standards debates began at the Institute of Electrical and Electronics Engineers (IEEE), which is an international organization serving industries with standards programs. Working Groups were established within the IEEE 802 Local and Metropolitan Area Network Standards Committee. This entity focused on the development of standards for Wireless Personal Area Networks (WPAN) or short range networks.

2.6.1. The IEEE 802.15 Working Group

With the FCC’s approval for commercial applications of UWB, much hope has been placed in UWB’s ability to support multimedia at very high speeds and efforts started to establish an industrial standard that covers these needs. This UWB standardization activity was concentrated in the 802.15 Working Group for Wireless Personal Area Networks [25] with the Task Group 3a (TG3a). Other activities of the 802.15 Working Group consisted in deriving the 802.15.1 Standard, a WPAN solution based on Bluetooth. The 802.15.2 also proposed recommendations to facilitate coexistence of WPAN (802.15) with WLAN (802.11). An overview of the 802.15 Working Group is given in Fig. 2.3.

2.6.2. High-speed UWB: the Failure of TG3a

In 2003, shortly after the adoption of the UWB spectrum mask, a new group within IEEE 802.15, namely the High Rate Alternative PHY Task Group (TG3a), was created to define a project providing a higher speed physical layer for applications involving imaging and multimedia. This Task Group was seen as an extension of the already existing IEEE 802.15.3. ¹

Due to the large allocated bandwidth, UWB has been identified as the unique solution to enable very high data rate below 10 GHz.

¹For instance, the IEEE 802.15.3 released in August 2003 a standard for wireless streaming audio and video applications at 2.4 GHz with a maximal data rate of 55 Mb/s over a 15 MHz channel bandwidth [26].
TG3a did actually manage to consolidate the physical layer from the twenty-three UWB physical layer (PHY) proposals into two:

- Multi-Band Orthogonal Frequency Division Multiplexing UWB (MB-OFDM UWB),
- Direct Sequence UWB (DS-UWB).

By the middle of 2003, consensus could, however, not be reached between the different industry alliances supporting these solutions. A second vote in July 2004 also could not reach a majority in the selection of a proposal. The MB-OFDM Group left the voting assembly, blaming Motorola for preventing their solution from getting the 75% majority needed to become the standard.

The MB-OFDM Group joined the WiMedia Alliance [27] to form an industrial consortium supported by heavy players like Intel, HP, Microsoft, Philips, Sony, Texas Instruments and Samsung. On the opposite side, Motorola and its offshoot Freescale Semiconductor formed the UWB Forum [28], whose members preferred the technology approach based on direct sequence (DS-UWB). After this episode, Freescale has concentrated its UWB efforts on its own version of wireless USB 2.0, called “Cable-Free USB” and, in the future, wireless 1394 (or wireless FireWire) and the High Definition Multimedia Interface (HDMI) [29].
2.6.3. The Standard Ecma 368/369

After the incident of the IEEE 802.15.3a task group, the decision process regarding high-speed UWB in the IEEE Standard remained jammed for the next three years. The disagreement finally led to the death of the IEEE 802.15.3a high-speed specification. The WiMedia Alliance decided to pursue standardization in an alternative body: Ecma. Ecma originally stood for *European Computer Manufacturers’ Association* and is now seen as an international body. In 2005, Ecma released two proposals for high-speed UWB standardization based directly on the WiMedia proposal. In March 2007, the Ecma standard proposals were approved for release as ISO/IEC International Standards. The Ecma-368 [30] standard, entitled “High Rate Ultra-Wideband PHY and MAC Standard”, was approved as ISO/IEC 26907; it specifies a distributed medium access control (MAC) sublayer and a physical layer (PHY) for wireless networks. The PHY and MAC specified in this Ecma standard are intended for high data rate communications between a diverse set of mobile and fixed electronic devices. In conjunction with this standard, the Ecma-369 standard [31], titled “MAC-PHY Interface for Ecma-368”, was approved as ISO/IEC 26908 and specifies the MAC-PHY interface for a high rate, ultra-wideband wireless transceiver. As of this writing, the third revision of the specification is complete.

2.6.4. Low Data Rate UWB: TG4a

Besides solutions aiming at transmission rates of hundreds of Mb/s over very short distances, UWB is also seen as an interesting technology for emerging communication and location-tracking systems with low power consumption, low complexity and high integration levels. In November 2002, the IEEE 802.15.4a Study Group looked at UWB as a candidate for a potential alternative physical layer to the 802.15.4 standard for low-power, low-data-rate wireless networks. The IEEE 802.15.4a became an official Task Group [32] in March 2004. The main interest was in providing data communications, high precision ranging and location capability (accuracy of 1 meter or better), throughput in the range of Mb/s, and ultra-low supply power of a few milliwatts.

The IEEE 802.15.4a managed to select a dual-PHY baseline standard that includes a UWB impulse-based radio (IR-UWB) for communications and ranging, and a chirp spread spectrum (CSS) radio, for communication only, operating in the 2.4 GHz ISM band. The targeted application is sensor networks. A standard from TG4a was published by the IEEE in March 2007 and was further amended in August 2007 to include the following alternate PHYs [33]:

- Ultra-wide band (UWB) PHY at frequencies from 3 GHz to 5 GHz, 6 GHz to 10 GHz, and below 960 MHz with mandatory data rate of 851 kb/s and optional data rates of 110 kb/s, 6.81 Mb/s, and 27.24 Mb/s;

- Chirp spread spectrum (CSS) PHY at 2450 MHz supporting data rate of 1 Mb/s and optionally 250 kb/s.

On 22 March 2007, 802.15.4a was approved as a new amendment to IEEE Std 802.15.4-2006 by the IEEE-SA Standards Board [34].

2.7. Conclusions

Considering regulations and Standards, we summarize the current status as follows:

- the FCC regulation provides a wide bandwidth from 3.1 GHz to 10.6 GHz for UWB equipment emitting a maximum mean EIRP density of -41.3 dBm/MHz;

- the European ECC regulation provides only two reduced frequency bands, i.e. 6-8.5 GHz and 3.1-4.8 GHz; the latter has additional constraints in terms of duty cycle on the emitted energy, and the former is seen as the long-term band for UWB applications in Europe;

- the IEEE 802.15.4 was chartered to investigate low data rate and low power solutions with reduced complexity. An amendment of this Standard defines an alternate PHY specification based on IR-UWB signalling.

In this work, we adopted a frequency band between 3.1 and 5 GHz for intentional emissions. The compliance with the FCC rules for a low-complexity transmitter using the IR-UWB principle was one of the main foci in the first part of this work (Chap. 3 and 4). With the appearance and the amendment of UWB rules in Europe, additional efforts have been dedicated to making the transmitter section sufficiently flexible to cope with the stringent ECC radio spectrum policy. Regarding the choice of a standard, we proposed in this work a proprietary solution, since the choices related to the implementation of an entire transceiver were made prior to the advent of any standard.
3

Transceiver Planning

3.1. Key Considerations and Main Objectives

The elaboration of a transceiver specification for a radio communication system has to deal with several contradictory parameters and is first of all a matter of trade-offs. Unlike in a wired system, where a point-to-point path exists, a single path between wireless transceivers cannot be guaranteed and is usually subject to highly variable propagation conditions. At the initiation of this thesis in late 2002, the use of the impulse-radio ultra-wideband (IR-UWB) wireless technology for commercial applications was a very new area with very few published works (see Fig. 3.1) and offered a new field of investigation.

The first task was thus the choice of a robust and flexible modulation scheme for the PHY layer of the wireless link. Following discussions with the industrial partner and in view of the difficulty of the task, it was decided to keep a relatively low complexity, especially in the receiver. The direct consequence was a limitation of the bit rate, despite the large signal bandwidth allowed by the UWB radio technology. This prevented the modulation scheme from reaching the bandwidth efficiency of today’s state-of-the-art wireless short range systems, which achieve up to 5 bits/s/Hz as is the case, e.g., in the latest Wireless Metropolitan Area Network (WMAN) Standard [35].

The main objectives for the design of a low-complexity transceiver operating in the UWB band can be enumerated as follows:

- no “brute force” digital signal processing (such as a RAKE receiver using high-speed or high-resolution analog-to-digital converters (ADC), and requiring heavy signal post-processing);
- data rates and power consumption that outperform existing WPAN standards such as Bluetooth (typically larger than 1 Mb/s and lower than 50 mW);
- robustness and reliability of the communication over UWB indoor channels.

More particularly, this work aims at the characterization of an entire and standalone transmitter-receiver IR-UWB link exhibiting a BER better than $10^{-2}$ for an equivalent free-space range of 10 m at 10 Mb/s.

![Figure 3.1. Number of IEEE publications per year containing the keyword “UWB” since 1988.](image)

**3.2. Radio Channel Basics**

**3.2.1. Introduction**

In this section, we describe the channel model that is used as a transfer function between the transmitter and the receiver. The channel model has to be simple enough to allow for efficient analysis, but has to capture most of the dominant characteristics of the propagation phenomena for a reliable analysis. In general, the propagation of radio waves can be categorized into four basic mechanisms: **path loss**, **absorption**, **reflection** and **diffraction**:

- **Path loss** The most basic mechanism of signal attenuation is the free space geometric path loss. Assuming a lossless medium and thus power conservation, the signal power through a virtual sphere of any radius centered on the emitter is constant. Therefore, the power through a fixed surface element (i.e. equivalent antenna aperture at the receiver) decreases with the square of the distance, i.e. the radius of the sphere. The attenuation is often described mathematically using the well-known path loss model given by [36]:

\[ G_P(d) = G_0 - 10 \cdot n \cdot \log_{10}\left(\frac{d}{d_0}\right), \]

(3.1)

where \( G_P(d) \) is the path loss at distance \( d \) and \( G_0 \) is the reference path loss at \( d_0 \) (usually \( d_0 = 1 \) m). Both path loss values are expressed in dB. In free space, the propagation exponent \( n \), which describes the rate of power attenuation, is equal to 2 but can take on higher values if the propagation path is obstructed, as explained in the next section. On the other hand, there exist conditions where \( n < 2 \); this mainly occurs in indoor line-of-sight (LOS) propagation environments where the energy of several propagation paths is reflected by obstacles (e.g. walls) and adds constructively at the receiving point. These values are extracted empirically from data collected in measurement campaigns.

- **Absorption** is the second cause of signal attenuation and occurs when the wave propagates through a lossy medium. The involved mechanism is the conversion of the transmitted energy into another form, usually thermal. The conversion takes place as a result of interaction between the incident energy and the material of the propagation medium, at the molecular or atomic level. The overall phenomenon is called *shadowing* owing to its similarity to the effect of clouds partly blocking sunlight.

- **Reflection** occurs when a propagating wave strikes an object with large dimensions in comparison to the wavelength of the propagating wave, such as large metal objects. Depending on the object’s material, the wave’s energy will be partially reflected and partially transmitted through the object. Reflection may be specular (mirror-like) or diffuse (i.e., only the reflected energy is retained and not the image of the source) according to the nature of the surface. A perfectly conducting material, which can be seen as a dielectric (air)-conductor interface, reflects all the incident waves and inverts their phase, while dielectric-dielectric interfaces may change the phase depending upon the frequency and the angle of incidence of the wave. At locations where many reflected waves exist, the received signal level tends to vary strongly owing to the constructive and destructive interference between the received waves. This phenomenon is commonly referred to as multipath fading [37, 38] and is the main cause of the dispersion of radio signals propagating in indoor environments.

- **Diffraction** is the spreading out of waves and occurs when the wavefront of a radio wave hits obstacles whose dimensions are smaller than the wavelength. The result of this interaction is a secondary wavefront propagating in the shadowed region behind the obstructing objects. Thus, diffraction makes the radio signal appear to bend or travel around the obstructing objects. The diffraction mechanism often allows the reception of weakened radio signals when the LOS conditions are not satisfied. *Scattering* occurs when the propagation path contains obstacles whose dimensions are comparable to the wavelength. The nature of this phenomenon is similar to diffraction, except that the radio waves are scattered in a greater number of directions. Another kind of scattering may occur when the radio wave hits surfaces with sharp irregularities such as floors or walls. Surface roughness is a parameter that may lead to a phenomenon called wave cluttering, which manifests itself as a disorganized wave propagation [39]. Of all the mentioned effects, scattering and cluttering are the most difficult to predict.

- For the sake of completeness, we have to mention **refraction**, which is due to the varying refractive index of the atmosphere: radio waves do not propagate along a straight line, but rather along a curved one. Therefore, the coverage area of an actual transmitter is usually larger than that predicted by LOS. Refraction has, however, a negligible impact on short-range indoor propagation.
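The log-distance model $G_P(d) = G_0 - 10 n \log_{10}(d/d_0)$ introduced for the path loss mechanism above can be evaluated with a few lines of code. This is a minimal sketch: the reference value $G_0$ is computed here from the free-space (Friis) relation between isotropic antennas, and the 4 GHz frequency is only an illustrative value near the middle of the 3.1-5 GHz band; both are assumptions of this example and not values from the text.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_reference_gain_db(f_hz, d0_m=1.0):
    """Free-space gain (negative, in dB) at the reference distance d0.

    Assumption of this sketch: isotropic antennas, Friis relation.
    """
    wavelength_m = SPEED_OF_LIGHT_M_S / f_hz
    return 20.0 * math.log10(wavelength_m / (4.0 * math.pi * d0_m))


def path_gain_db(d_m, g0_db, n=2.0, d0_m=1.0):
    """Log-distance model: G_P(d) = G_0 - 10 * n * log10(d / d0)."""
    return g0_db - 10.0 * n * math.log10(d_m / d0_m)


if __name__ == "__main__":
    g0 = free_space_reference_gain_db(4e9)   # illustrative 4 GHz carrier
    for n in (1.7, 2.0, 3.5):                # LOS indoor, free space, obstructed
        print(f"n = {n}: G_P(10 m) = {path_gain_db(10.0, g0, n):.1f} dB")
```

With $n = 2$, each decade of distance costs 20 dB; the exponents 1.7 and 3.5 are hypothetical examples of the $n < 2$ and $n > 2$ cases discussed above.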

### 3.2.2. Wireless Channel Metrics

From the effects described in the previous section, we can divide the variations affecting the propagation channel into two types:

- path loss and absorption are both said to belong to large-scale fading mechanisms; these effects reveal themselves as an attenuation of the average signal power in the sense that, for a small variation of the frequency or the position of the emitter and receiver, the transfer function of the channel does not change drastically (i.e. by a few dB and slowly). Large-scale fading is described in terms of statistical variations about the mean;

- reflections and diffraction cause small-scale fading, which is due to the constructive and destructive interference of the multiple signal paths between the transmitter and receiver. This occurs at a spatial scale of the order of the carrier wavelength and is strongly frequency dependent. It means that for a small change in frequency or position of the emitter-receiver pair, the channel’s transfer function may change completely, i.e. by up to several tens of dB, and suddenly.

We can generalize the formulation of a received signal $y(t)$, which results from the superposition of several attenuated and delayed replicas of the emitted signal $x(t)$, with the following formula:

$$y(t) = \sum_i a_i \cdot x(t - \tau_i),$$

(3.2)

where $\{a_i\}$ and $\{\tau_i\}$ are the attenuations and the delays of the $i$ paths, respectively. When considering mobile communications, the channel cannot be considered as time-invariant. Hence, the attenuation and delay coefficients are dynamic and the previous equation becomes:

$$y(t) = \sum_i a_i(t) \cdot x(t - \tau_i(t)).$$

(3.3)

It should be noted that although the individual time-variant coefficients are assumed independent of the frequency, the overall received signal $y(t)$ can still vary with frequency due to the fact that different path lengths have different delays, and consequently different phases.
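The frequency dependence mentioned above can be made concrete with a static two-path version of Eq. 3.2. For a sinusoidal input at frequency $f$, each path contributes $a_i e^{-j 2 \pi f \tau_i}$, so the paths add in phase at some frequencies and cancel at others. The amplitudes and delays below are hypothetical values chosen only for illustration.

```python
import cmath
import math


def channel_gain_db(freq_hz, amplitudes, delays_s):
    """Magnitude (in dB) of H(f) = sum_i a_i * exp(-j*2*pi*f*tau_i)."""
    h = sum(a * cmath.exp(-2j * math.pi * freq_hz * tau)
            for a, tau in zip(amplitudes, delays_s))
    return 20.0 * math.log10(abs(h))


if __name__ == "__main__":
    # Hypothetical two-path channel: direct path and one reflection
    # arriving 2.5 ns later with 80 % of the amplitude.
    amplitudes = [1.0, 0.8]
    delays_s = [10e-9, 12.5e-9]
    for f_ghz in (4.0, 4.1, 4.2, 4.3, 4.4):
        gain = channel_gain_db(f_ghz * 1e9, amplitudes, delays_s)
        print(f"{f_ghz:.1f} GHz: {gain:6.1f} dB")
```

With a delay difference of 2.5 ns, the response repeats every 400 MHz: the two paths add constructively at 4.0 GHz and almost cancel at 4.2 GHz, although the coefficients themselves are frequency-independent.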

Since the channel is linear, it can be described by the response $h(\tau, t)$ at time $t$ to an impulse transmitted at time $t - \tau$:

$$y(t) = \int_{-\infty}^{+\infty} h(\tau, t)x(t - \tau)d\tau.$$  

(3.4)

The impulse response of the channel at a fixed time $t$ is

$$h(\tau, t) = \sum_i a_i(t) \cdot \delta(\tau - \tau_i(t)).$$

(3.5)

The next step in creating a useful channel model is to convert the continuous-time channel into a discrete-time channel. In this kind of model, the time axis is divided into small time intervals called “bins” or “taps”. Each tap is assumed to contain either one multipath component or no component. More than one component per bin is excluded, but a component may represent the aggregation of all the physical paths whose delays fall into the considered bin. The bin width is thus related to the time resolution of the channel model or, equivalently, to the ability of the model to resolve distinct paths. Using this model, each impulse response can be described by a sequence of zeros and ones multiplied by a number which corresponds to the attenuation of the path for the given bin (a zero represents the absence of a path in that bin). The phenomenon of multipath propagation can be represented by the following discrete impulse response of the channel

$$ h(t) = \sum_{l=0}^{L-1} a_l \delta(t - lT_m), $$

(3.6)

where $a_l$ is the amplitude attenuation factor on path (or bin) $l$. $T_m$ is the resolution time, $L$ is the number of resolvable multipath components and $\delta(\cdot)$ is the Dirac delta function. Sometimes, Eq. 3.6 is referred to as the multipath intensity profile.
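A minimal sketch of how the tapped-delay-line model of Eq. 3.6 acts on a sampled signal is given below. It assumes that the signal is sampled with a period equal to the bin width $T_m$, so that tap $l$ simply delays the input by $l$ samples; the tap values are hypothetical.

```python
def apply_tapped_delay_line(x, taps):
    """Return y[k] = sum_l a_l * x[k - l] for a sampled input x.

    taps[l] is the attenuation a_l of bin l (0.0 means no path in that bin).
    The sample period is assumed equal to the bin width T_m.
    """
    y = [0.0] * (len(x) + len(taps) - 1)
    for l, a_l in enumerate(taps):
        if a_l == 0.0:
            continue  # empty bin, no multipath component
        for k, x_k in enumerate(x):
            y[k + l] += a_l * x_k
    return y


if __name__ == "__main__":
    # Hypothetical channel with paths in bins 0, 2 and 5.
    taps = [1.0, 0.0, 0.5, 0.0, 0.0, -0.25]
    pulse = [1.0, -1.0]  # simple two-sample bipolar pulse
    print(apply_tapped_delay_line(pulse, taps))
```

Each occupied bin produces one scaled and delayed copy of the transmitted pulse at the output, which is the discrete counterpart of Eq. 3.2.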

Delayed signals through a channel are described using the following definitions:

- **Average delay** which describes the mean travel time of the signal from the transmitter to the receiver;

- **Delay spread** which is a metric of how much the signal is diluted in time, and usually described by its RMS value $\tau_{\text{RMS}}$;

- **Maximum or total delay spread** which measures the largest delay due to the multipath.
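The three delay metrics can be computed directly from a discrete channel with taps $a_l$ spaced by $T_m$. The sketch below uses the commonly employed power-weighted definitions (first and second moments of the power delay profile $a_l^2$); the text above does not give explicit formulas, so these definitions are an assumption of this example. Delays are measured relative to the first bin.

```python
import math


def delay_metrics(taps, t_m_s):
    """Return (mean delay, RMS delay spread, total delay spread) in seconds.

    Assumed definitions: power-weighted moments of the profile a_l**2.
    """
    powers = [a * a for a in taps]
    total_power = sum(powers)
    if total_power == 0.0:
        raise ValueError("channel has no multipath component")
    mean = sum(p * l * t_m_s for l, p in enumerate(powers)) / total_power
    second = sum(p * (l * t_m_s) ** 2 for l, p in enumerate(powers)) / total_power
    rms = math.sqrt(max(second - mean * mean, 0.0))
    occupied = [l for l, p in enumerate(powers) if p > 0.0]
    total_spread = (occupied[-1] - occupied[0]) * t_m_s
    return mean, rms, total_spread


if __name__ == "__main__":
    # Hypothetical profile with 1 ns bins.
    taps = [1.0, 0.0, 0.5, 0.0, 0.0, -0.25]
    mean, rms, total = delay_metrics(taps, 1e-9)
    print(f"mean = {mean * 1e9:.2f} ns, rms = {rms * 1e9:.2f} ns, "
          f"total = {total * 1e9:.2f} ns")
```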

### 3.2.3. Modeling Concepts

To study the channel properly, a characterization experiment known as “channel sounding” is often carried out. It consists in applying a stimulus at one location and measuring the response at another. The impulse response of a channel may be measured using different techniques, as summarized hereafter:

- **Swept carrier techniques** are channel measurement methods in the frequency domain, where a channel is excited with a continuous wave signal [40]. The frequency is swept or incremented over the frequency range of interest. The time domain impulse response is obtained by calculating the inverse Fourier transform of the frequency response.

- The **direct pulse** measurement principle is performed directly in time domain and provides the impulse response theoretically immediately [41]. In practice, this technique suffers from some impairments: 1) a high peak-to-average signal is needed to detect weak reflections, and 2) the environment must be free from interferers to obtain reliable results. Practical implementations of direct pulse sounding with coding gain enhancement can however be found in [42].

- The third technique is called **pulse compression** and combines the advantage of the high energy of long duration signals and the higher resolution in time domain of short pulses. It uses the correlation property of (pseudo-) random signals. This signal, when fed through a linear time invariant (LTI) system, results in a response, whose autocorrelation function produces coefficients that are proportional to the LTI channel [43].

Swept carrier and pulse compression techniques are preferred, since they perform better in the presence of noise and usually have a much better dynamic range, which allows the detection of strongly faded signals. A detailed review of these techniques is provided in [44–46].
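The pulse compression principle can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: a noise-free channel, a periodic probe made of one 31-chip maximal-length sequence (generated with the primitive polynomial $x^5 + x^2 + 1$), and a channel shorter than one sequence period. The helper names `m_sequence` and `sound_channel` are hypothetical. The correlation of the received signal with the probe sequence returns the channel taps, because the periodic autocorrelation of such a sequence is a sharp peak with a constant off-peak value of $-1$.

```python
def m_sequence(degree=5, feedback_taps=(5, 3)):
    """One period of a +/-1 maximal-length sequence (here x^5 + x^2 + 1)."""
    state = [1] * degree
    chips = []
    for _ in range(2 ** degree - 1):
        chips.append(1.0 if state[-1] else -1.0)
        feedback = 0
        for tap in feedback_taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return chips


def sound_channel(channel_taps, chips):
    """Estimate the channel taps by correlating the response with the probe."""
    n = len(chips)
    # Response of the LTI channel to the periodic probe (circular convolution)
    received = [
        sum(h * chips[(i - l) % n] for l, h in enumerate(channel_taps))
        for i in range(n)
    ]
    # Circular cross-correlation between the response and the probe
    corr = [
        sum(received[i] * chips[(i - k) % n] for i in range(n))
        for k in range(n)
    ]
    # Remove the bias caused by the off-peak autocorrelation value of -1
    total = sum(corr)
    return [(c + total) / (n + 1) for c in corr]


channel = [1.0, 0.0, 0.5, 0.0, 0.0, -0.25]  # illustrative taps only
estimate = sound_channel(channel, m_sequence())
print([round(x, 3) for x in estimate[:8]])
```

A long probe sequence carries much more energy than a single short pulse of the same peak amplitude, which is where the dynamic range advantage mentioned above comes from.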

### 3.3. The UWB Channel

#### 3.3.1. Introduction

Most of the traditional channel or propagation models are based on the **narrowband assumption**, i.e. the system bandwidth is a very small fraction (less than one percent) of the center frequency. In the narrowband case, such as used in GSM or other cellular technologies, the channel can be represented with a reduced number of paths, usually four or five. This assumption no longer holds for UWB. The difference lies principally in the temporal resolution needed to account for the propagation phenomena of large bandwidth channels. UWB channels can require up to several hundred bins for a correct representation. Furthermore, the amplitude and delay coefficients of Eq. 3.5 are no longer frequency-independent due to the large bandwidth considered.

The following sections provide a review of the UWB channel models and their evolution since the amendment of the FCC regulation in 2002.

### 3.3.2. Early UWB Multipath Models

The model used for this thesis is based on the work presented by Intel to the IEEE P802.15, the IEEE Working Group for WPAN, and issued in December 2002 [47]. This propagation model, which is mainly based on measurements, suggests an interesting phenomenon. As illustrated in Fig. 3.2, path arrivals tend to come in clusters rather than being uniformly spaced in time. This was also observed in several other indoor channel measurements at lower frequencies (typically below 2 GHz), such as by Saleh and Valenzuela [48], Hashemi [49] or Rappaport and Sandhu [50]. The Saleh-Valenzuela model seemed to best fit this clustering phenomenon and was adopted by Intel with some slight modifications. The cluster phenomenon comes from the fine spatial resolution of wideband signalling. Assuming a pulsed UWB signal occupying 1 GHz, the signal, whose duration is 1 ns, can resolve paths separated by 30 cm. Hence, different parts of the same object (e.g. walls, furnishings, people) produce several returns, which all belong to one cluster.
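The 30 cm figure follows directly from the pulse duration. As a short worked step, assuming that the pulse duration $T$ is approximately the inverse of the occupied bandwidth $B$ and that $\Delta d$ denotes the smallest resolvable path length difference (a symbol introduced here for illustration only):

$$ T \approx \frac{1}{B} = \frac{1}{1\ \text{GHz}} = 1\ \text{ns}, \qquad \Delta d = c_0 \cdot T \approx 3 \cdot 10^{8}\ \text{m/s} \cdot 1\ \text{ns} = 0.3\ \text{m}. $$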

![Figure 3.2. Illustration of the cluster phenomenon observed in the power decay profile of UWB channels.](image)

The Intel measurements, taken in the UWB band, empirically show a log-normal\(^1\) distribution for the multipath fading coefficients; this distribution seemed to fit the measurements better than the Rayleigh\(^2\) distribution used in [48]. The model also statistically characterizes the multipath arrival times and the fading of both the clusters and the rays within a cluster in an independent manner. Therefore, to account for the cluster effect, the simple multipath Eq. 3.2 can be rewritten as

\[ h^{[i]}(t) = X^{[i]} \sum_{l=0}^{L-1} \sum_{k=0}^{K-1} \alpha^{[i]}_{k,l} \delta(t - T^{[i]}_l - \tau^{[i]}_{k,l}), \tag{3.7} \]

where \( i \) denotes the \( i^{th} \) channel realization, and the multipath gain \( \{\alpha^{[i]}_{k,l}\} \) for cluster \( l \) and ray \( k \) has a log-normal distribution. \( \{T^{[i]}_l\} \) is the delay of the \( l^{th} \) cluster and \( \{\tau^{[i]}_{k,l}\} \) is the delay of the \( k^{th} \) multipath component (ray) relative to the \( l^{th} \) cluster arrival time \( T^{[i]}_l \). Both cluster and ray arrival times are described by Poisson processes. Finally, the log-normal shadowing of the total multipath energy is modelled by \( \{X^{[i]}\} \). The Intel model provided four indoor channel models (CM), namely CM1 to CM4, for different environments.

---

\(^1\)A log-normal distribution is a probability distribution of a random variable whose logarithm is normally distributed.

\(^2\)The Rayleigh distribution is frequently used to model multipath fading with no direct line-of-sight (LOS) path.
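To make the structure of Eq. 3.7 more concrete, the following sketch draws one channel realization with Poisson cluster and ray arrivals, log-normal path gains and log-normal shadowing. It is not the reference implementation of [47]: the exponential decay of the mean path power with cluster and ray delay is borrowed from the Saleh-Valenzuela approach as an assumption, the random path polarity is an assumption, and all parameter names and numerical values (`cluster_rate`, `ray_rate`, `cluster_decay`, `ray_decay`, `sigma_db`, `shadow_sigma_db`) are illustrative placeholders.

```python
import math
import random


def sv_realization(rng, cluster_rate, ray_rate, cluster_decay, ray_decay,
                   sigma_db=3.4, shadow_sigma_db=3.0, max_delay=100e-9):
    """Return a sorted list of (delay in s, amplitude) pairs for one realization."""
    paths = []
    t_cluster = 0.0  # the first cluster arrives at t = 0
    while t_cluster < max_delay:
        tau = 0.0  # the first ray of a cluster coincides with the cluster start
        while t_cluster + tau < max_delay:
            # Assumed double-exponential decay of the mean path power
            mean_power = (math.exp(-t_cluster / cluster_decay)
                          * math.exp(-tau / ray_decay))
            # Log-normal fading: Gaussian in dB around the mean power
            fading_db = rng.gauss(0.0, sigma_db)
            amplitude = math.sqrt(mean_power) * 10.0 ** (fading_db / 20.0)
            sign = rng.choice((-1.0, 1.0))  # assumed random polarity
            paths.append((t_cluster + tau, sign * amplitude))
            tau += rng.expovariate(ray_rate)        # Poisson ray arrivals
        t_cluster += rng.expovariate(cluster_rate)  # Poisson cluster arrivals
    # Normalize the multipath energy, then apply the shadowing term X
    energy = sum(a * a for _, a in paths)
    shadow = 10.0 ** (rng.gauss(0.0, shadow_sigma_db) / 20.0)
    return sorted((t, shadow * a / math.sqrt(energy)) for t, a in paths)


# Placeholder parameters: rates in 1/s, decay constants in s
realization = sv_realization(random.Random(1), cluster_rate=0.0233e9,
                             ray_rate=2.5e9, cluster_decay=7.1e-9,
                             ray_decay=4.3e-9)
print(len(realization), "paths, first path:", realization[0])
```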

### 3.3.3. The IEEE 802.15.4a Model

Later, the IEEE 802.15.4a Channel Modelling Subgroup developed a more general model, which encompasses a wider range of environments such as factories or warehouses, each of them in both line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. While it is partly based on a Saleh-Valenzuela approach [48], it includes a Nakagami or Rice\(^3\) small-scale fading (instead of log-normal). The final report describing this model is available in Ref. [51]. This model proposes eight different scenarios (CM1-CM8) for UWB channels from 2 to 10 GHz. The corresponding definition of each channel model scenario can be found in Table 3.1. In Fig. 3.3, the average power decay profiles of the six indoor scenarios (CM1-4 and CM7-8) are illustrated for LOS and NLOS. CM5 and CM6 model outdoor scenarios and are not considered in this work. An example of the impulse response of a typical channel realization for the channel model CM3 (indoor office LOS) is depicted in Fig. 3.4. This figure clearly illustrates the “cluster” phenomenon affecting the time of arrival of the different multipath components.

---

\(^3\)The Rice model (or Nakagami-n) is often used to model propagation paths consisting of one strong direct LOS component and many random weaker components.

Figure 3.3. Average power decay profiles of IEEE 802.15.4a indoor channel models (100 realizations). Left column is LOS and right column shows NLOS. The average delay spread $\tau_{RMS}$ varies over one decade from 9 ns (CM7) to 89 ns (CM8).

![Channel model: CM3r1](image)

**Figure 3.4.** Energy-normalized impulse response of channel model CM3 (realization 1) given both in linear (top) and decibel scale (bottom). This impulse response illustrates well the “cluster” phenomenon affecting the time of arrival of the different multipath components (see Section 3.3.2).

### 3.3.4. The Path Loss of the IEEE 802.15.4a Model

The path loss in a narrowband free-space environment is conventionally defined as

\[ L_P = \left( \frac{4\pi d}{\lambda} \right)^2 = \left( \frac{4\pi df}{c_0} \right)^2 \tag{3.8} \]

where \( d \) is the distance between the receiver and the transmitter, \( \lambda \) is the wavelength, \( f \) the frequency and \( c_0 \) the speed of light. For example, the free-space attenuation at a center frequency of 5 GHz and over a distance of 1 m is 46.42 dB. A typical rule of thumb is to avoid the use of this path loss equation for distances shorter than about one wavelength. Below this distance, physical factors like antenna dimensions dominate and force the use of electromagnetic field equations (Maxwell equations).
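The numerical example above can be checked with a few lines of code. This is a minimal sketch of the free-space formula expressed in decibels; the function name `free_space_path_loss_db` is chosen here for illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in m/s


def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss (4*pi*d*f/c0)^2 expressed in dB."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / C0)


# 1 m at 5 GHz, the case quoted in the text
print(round(free_space_path_loss_db(1.0, 5e9), 2), "dB")
```

Since the loss grows with the square of the distance, each doubling of $d$ adds about 6 dB.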

The current model for path loss with UWB channels uses a similar formula, but the wavelength \( \lambda \) is computed at the center frequency (geometrical mean of the upper and lower 10 dB cutoff frequencies) of the system. Following [51], we use the following path loss $G_P(d, f)$ for UWB channels:

$$G_P(d, f) = G_0 + 10 \cdot n \cdot \log_{10} \frac{d}{d_{0}} + 20 \cdot \log_{10} \frac{f}{f_{m}} + A_{body}, \quad (3.9)$$

where $G_0$ is the nominal path loss for $d = d_0 = 1$ m and is given in Table 3.1 for the different scenarios of the IEEE 802.15.4a channel model. The parameter $n$ is the propagation exponent, $d$ is the distance between the receiver and the transmitter, $f$ is the geometric center frequency of the UWB signal and $f_m$ is the frequency at which the reference channel measurements have been done ($f_m = 5$ GHz). $A_{body}$ accounts for the presence of a person (user) close to the antenna. This proximity leads to an attenuation varying between 1 dB and more than 10 dB, depending on the user and the frequency. Based on the aforementioned Ref. [51], the attenuation factor has been fixed to a value equal to 0.5 ($A_{body} = 3$ dB). Some publications, however, suggest that this value is far too optimistic. References [52] and [53], which consider measurements with both the transmitter and the receiver mounted on the same body to build a WBAN, showed additional body attenuation of up to 20 dB in a light multipath environment.

<table>
<thead>
<tr>
<th>CM</th>
<th>Description</th>
<th>$G_0$ [dB] ($d_0 = 1$ m)</th>
<th>n</th>
</tr>
</thead>
<tbody>
<tr>
<td>CM0</td>
<td>Free space</td>
<td>46.4 dB</td>
<td>2.00</td>
</tr>
<tr>
<td>CM1</td>
<td>Residential - LOS</td>
<td>43.9 dB</td>
<td>1.79</td>
</tr>
<tr>
<td>CM2</td>
<td>Residential - NLOS</td>
<td>48.7 dB</td>
<td>4.58</td>
</tr>
<tr>
<td>CM3</td>
<td>Indoor office - LOS</td>
<td>35.4 dB</td>
<td>1.63</td>
</tr>
<tr>
<td>CM4</td>
<td>Indoor office - NLOS</td>
<td>59.9 dB</td>
<td>3.07</td>
</tr>
<tr>
<td>CM5</td>
<td>Open outdoor - LOS</td>
<td>45.6 dB</td>
<td>1.76</td>
</tr>
<tr>
<td>CM6</td>
<td>Open outdoor - NLOS</td>
<td>73.0 dB</td>
<td>2.50</td>
</tr>
<tr>
<td>CM7</td>
<td>Indoor industrial - LOS</td>
<td>56.7 dB</td>
<td>1.20</td>
</tr>
<tr>
<td>CM8</td>
<td>Indoor industrial - NLOS</td>
<td>56.7 dB</td>
<td>2.15</td>
</tr>
</tbody>
</table>

Table 3.1. Path loss of the IEEE 802.15.4a channel model derived for $f_m = 5$ GHz.
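
Equation 3.9 together with the parameters of Table 3.1 can be evaluated as follows. This is a minimal sketch; the dictionary `CHANNEL_PARAMETERS` simply copies the table, while the function name `path_loss_db` and its default arguments (in particular the body attenuation set to zero by default) are assumptions made for the example.

```python
import math

# (G0 in dB at d0 = 1 m, propagation exponent n), copied from Table 3.1
CHANNEL_PARAMETERS = {
    "CM0": (46.4, 2.00), "CM1": (43.9, 1.79), "CM2": (48.7, 4.58),
    "CM3": (35.4, 1.63), "CM4": (59.9, 3.07), "CM5": (45.6, 1.76),
    "CM6": (73.0, 2.50), "CM7": (56.7, 1.20), "CM8": (56.7, 2.15),
}


def path_loss_db(channel, distance_m, f_hz, f_m_hz=5e9, d0_m=1.0,
                 a_body_db=0.0):
    """Path loss G_P(d, f) of Eq. 3.9 in dB for one channel scenario."""
    g0_db, n = CHANNEL_PARAMETERS[channel]
    return (g0_db
            + 10.0 * n * math.log10(distance_m / d0_m)
            + 20.0 * math.log10(f_hz / f_m_hz)
            + a_body_db)


# Residential LOS link over 10 m at 4 GHz, with the 3 dB body attenuation
print(round(path_loss_db("CM1", 10.0, 4e9, a_body_db=3.0), 1), "dB")
```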

### 3.3.5. Channel Dynamics

When considering mobile applications, the characteristics of the channel may vary in a temporal or in a spatial manner. These variations are caused by the displacement of the emitter and/or the receiver. Many statistical studies on the fluctuations of UWB signals are available in the literature. Most of them focus on the spatial fluctuation and are based on experiments in which the received power is measured while the antenna is moved on a grid, resulting in a series of channel realizations, such as in [54, 55]. This technique considers the spatial fluctuations experienced by a mobile terminal for LOS conditions.

Other causes of fluctuations may result from objects or persons moving in the vicinity of the emitter-receiver system. The nature of this temporal dynamic behavior has hardly been investigated in the literature; some studies can however be found in [48, 56]. These references show that the indoor channel can be treated as quasi-static or very slowly time-varying.

The quasi-static assumption is the reason why channel dynamics are not considered in this work with regard to the specifications of the receiver. Only some parameters, such as the time constant in the gain control of the variable gain amplifier (Sec. 6.5) may have an influence on the ability of the receiver chain to track a varying signal amplitude. It is however assumed that this tracking speed can be made much faster than the variability of the channel characteristics.

### 3.3.6. Conclusions

A major difference between UWB and narrowband channels lies in the number of paths that has to be considered to model the channel. Another important aspect is related to the fact that UWB is primarily intended for indoor applications. Indoor and outdoor channels are similar in their basic features, since they both experience multipath dispersion. However, major differences can be enumerated as follows [57]:

- the indoor channel can be time- and space-variable due to the mobility of the emitter and the receiver, whereas conventional mobile channels are only variant in space because of the stationarity of the base station;

- path loss in an indoor environment may be very severe and may fluctuate strongly;

- rapid motions and high velocities are absent in indoor environments, where the Doppler shift is consequently negligible;

- excess delays are smaller in indoor channels thanks to the shorter transmission ranges involved, which allows higher transmission rates.

Regarding the first two points, early experiments demonstrated the robustness and the reliability of UWB signals in dense multipath indoor environments [58] by measuring the mean and the variance of the signal strength at different locations; the authors used a 1 GHz bandwidth signal and showed a maximum variation of only 5 dB as the receiver changed its position in a room. This is considerably less than the fading of narrowband signals, which may exceed 20 dB in similar environments [59–61].

This aspect brings an interesting advantage for the design of a low-complexity receiver. A smaller fading margin enables better tracking of the incoming signal power and reduces the number of blind periods, during which the receiver may lose synchronization.

### 3.4. IR-UWB Signals

This section proposes a simple metric to characterize carrier-based IR-UWB signals. It uses the specific properties of the Gaussian function.

#### 3.4.1. Gaussian IR-UWB Pulse Metrics

In this chapter, we assume that the envelope \( s(t) \) of a pulsed signal can be described by a Gaussian function. Hence, \( s(t) \) is defined as

\[
s(t) = \frac{a}{b} \cdot e^{-\pi \left(\frac{t}{b}\right)^2}, \tag{3.10}
\]

where \( a/b = A \) is the pulse peak amplitude and \( b \) is a factor proportional to the pulse length; \( s(t) \) is depicted in Fig. 3.5. The Fourier transform of \( s(t) \) is

\[
S(f) = \int_{-\infty}^{\infty} s(t) e^{-2\pi if t} dt = ae^{-\pi (bf)^2}. \tag{3.11}
\]

An interesting property of a Gaussian pulse is that its Fourier transform results again in a Gaussian function. This simple relationship between a Gaussian signal expressed in time and frequency domain will allow us to easily derive the equations that describe the relation between the input and output signals of Gaussian filters, as used later in Section 3.7.3, for example.

The energy $E_s$ of a Gaussian pulse $s(t)$, whose Fourier transform is $S(f)$ as given in Eq. 3.11, can be calculated directly from the temporal expression $s(t)$. It can also be obtained by integrating the energy spectral density (ESD) $\Phi_s(f)$ of the signal over the frequency $f$. The definition of the ESD is $\Phi_s(f) = |S(f)|^2$. The energy $E_s$ can be expressed as

$$E_s = \int_{-\infty}^{\infty} [s(t)]^2 dt = \frac{a^2}{b \cdot 2\sqrt{2}} \operatorname{erf} \left( \sqrt{2\pi} \frac{t}{b} \right) \bigg|_{-\infty}^{\infty} \quad (3.12)$$

$$= \int_{-\infty}^{\infty} \Phi_s(f) df = \frac{a^2}{b \cdot 2\sqrt{2}} \operatorname{erf} \left( \sqrt{2\pi} bf \right) \bigg|_{-\infty}^{\infty}. \quad (3.13)$$

Since $\operatorname{erf} (\cdot)$ evaluated between $-\infty$ and $+\infty$ yields $\operatorname{erf}(\infty) - \operatorname{erf}(-\infty) = 2$, the resulting energy of a single baseband Gaussian pulse ("envelope" signal) is

$$E_s = \frac{a^2}{b \sqrt{2}} = \frac{A^2 b}{\sqrt{2}}. \quad (3.14)$$

Figure 3.5. Normalized Gaussian pulse in time domain ($a = b = A = 1$). The half-amplitude length is approximately equal to the parameter $b$. 

Note that the energy spectral density of a single pulse is expressed in Ws/Hz.
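
The closed-form energy $a^2/(b\sqrt{2})$ of the Gaussian envelope can be cross-checked numerically. The sketch below integrates $s^2(t)$ of Eq. 3.10 with a simple trapezoidal rule; the function names and the integration span of $\pm 6b$ are choices made for this illustration only.

```python
import math


def gaussian_envelope(t, a, b):
    """Gaussian envelope s(t) of Eq. 3.10."""
    return (a / b) * math.exp(-math.pi * (t / b) ** 2)


def energy_closed_form(a, b):
    """Energy of the baseband Gaussian pulse, a^2 / (b * sqrt(2))."""
    return a * a / (b * math.sqrt(2.0))


def energy_numeric(a, b, span=6.0, steps=20000):
    """Trapezoidal integration of s(t)^2 over [-span*b, +span*b]."""
    dt = 2.0 * span * b / steps
    total = 0.0
    for i in range(steps + 1):
        t = -span * b + i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian_envelope(t, a, b) ** 2
    return total * dt


# Normalized pulse of Fig. 3.5 (a = b = 1)
print(energy_closed_form(1.0, 1.0), energy_numeric(1.0, 1.0))
```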

### 3.4.2. Frequency Translated Gaussian Pulse

When considering the frequency translated version of the baseband Gaussian signal, the envelope $s(t)$ has to be modulated by a cosine wave. The modulation actually corresponds to a carrierless process, which is also known as DSB-SC (double sideband suppressed carrier) modulation. The resulting two-sided spectrum\(^4\) $S_{RF}(f)$ is formed by the baseband spectrum $S(f)$, which is translated to $+f_{RF}$ and $-f_{RF}$ and whose amplitude is halved. The corresponding two-sided ESD $\Phi_{RF}(f)$ is thus given by

\[
\begin{aligned}
\Phi_{RF}(f) = [S_{RF}(f)]^2 &= \left[ \frac{1}{2} S(f - f_{RF}) + \frac{1}{2} S(f + f_{RF}) \right]^2 \qquad (3.15) \\
&\approx \frac{1}{4} \left[ S^2(f - f_{RF}) + S^2(f + f_{RF}) \right]. \qquad (3.16)
\end{aligned}
\]

Signals exhibiting a Gaussian-shaped spectrum $S(f)$ actually have an infinite bandwidth. The approximation of Eq. 3.16 is thus only valid for signals having sufficiently small bandwidth with respect to the translation frequency $f_{RF}$ (also called carrier frequency). The two-sided energy spectral density of the frequency translated Gaussian pulse is reduced by a factor four and the total energy is half the energy of the baseband Gaussian pulse,

\[
E_{s,RF} = \frac{a^2}{b \cdot 2\sqrt{2}}. \tag{3.17}
\]

Equivalently, if we express signals with voltage and impedance, we get:

\[
E_{s,RF} = \left( \frac{a}{b} \right)^2 \frac{b}{2\sqrt{2}} = \frac{A_{RF}^2}{Z} \frac{b}{2\sqrt{2}}, \tag{3.18}
\]

where $A_{RF}$ is the voltage amplitude of the pulse at impedance $Z$.

---

\(^4\)A two-sided spectrum displays half the energy at the positive frequency and half the energy at the negative frequency. Therefore, to convert a two-sided spectrum to a single-sided spectrum, we discard the second half of the array and multiply every point of the two-sided spectrum except for DC by two.

### 3.4.3. Pulse Bandwidth (Ideal Case)

The first constraint is related to the baseband signal’s -10 dB bandwidth, which has to be wider than $B_{\text{min}} = 500$ MHz. The -10 dB bandwidth is defined by the following relation

$$\frac{S^2(0)}{S^2(B_s/2)} = \frac{a^2}{\left[a \cdot e^{-\pi (b B_s/2)^2}\right]^2} = 10. \quad (3.19)$$

Thus, the relation between the -10 dB bandwidth $B_s$ of the equivalent baseband pulse and the parameter $b$ is given by

$$b = \frac{2}{B_s} \sqrt{-\frac{\ln(\sqrt{1/10})}{\pi}} \approx \frac{1.211}{B_s}. \quad (3.20)$$

For $B_s = 500$ MHz, as specified by the FCC regulation, we get $b \approx 2.42 \cdot 10^{-9}$ (in seconds).

To obtain the equivalent PSD, the ESD must be multiplied by the pulse repetition rate $R_p$:

$$\Phi_{\text{RF}} = \Phi_{\text{RF}}(f_{\text{RF}}) = 2 \Phi_{\text{RF}} R_p = \frac{a^2}{2} R_p < \phi_{\text{reg}}, \quad (3.21)$$

According to Eq. 3.11 and 3.16, the peak value of the one-sided ESD is $\frac{a^2}{2}$ and must be correctly chosen in order to comply with the regulation. Note that, in the case of a Gaussian pulse and for a defined signal bandwidth $B_s$, it will not be possible to reach the maximum transmitted power of a perfectly flat spectrum within the defined bandwidth due to the non-ideal spectrum shape. This limitation can be seen as an implementation loss of the system and will be called “non-flat spectrum implementation losses” $L_{\text{nfs}}$.

Assuming a pulse repetition rate $R_p$ in pulses per second, the maximum PSD value located at $f_{\text{RF}}$ must satisfy

$$\hat{\Phi}_{\text{RF}} = \Phi_{\text{RF}}(f_{\text{RF}}) = 2 \Phi_{\text{RF}} R_p = \frac{a^2}{2} R_p < \phi_{\text{reg}}, \quad (3.22)$$

where $2 \Phi_{\text{RF}}$ is the peak value of the one-sided ESD and, from Eq. 3.16, is $2 \cdot \frac{1}{4} \cdot S(0)^2 = \frac{a^2}{2}$. The term $\phi_{\text{reg}}$ corresponds to the maximum (single-sided) PSD allowed by the regulation (75 nW/MHz or -41.3 dBm/MHz):

$$\phi_{\text{reg}} = 75 \text{ fW/Hz}. \quad (3.23)$$

Thus, for $R_p$ equal to 10 MHz, $a$ is roughly equal to $1.22 \cdot 10^{-10}$ and the $1 \Omega$-normalized amplitude $A = \frac{a}{b}$ of the pulse is $A \approx 0.05$ (which corresponds to a pulse amplitude of $A_{RF} = 355$ mV under 50 $\Omega$). The energy per pulse is $E_{s,RF} = 2.19$ pJ. Thus, the total emitted average power $P_{Tx}$ is given by the energy of a pulse $E_{s,RF}$ multiplied by the pulse rate $R_p$, i.e.,

$$P_{Tx} = E_{s,RF} \cdot R_p = -16.6 \text{ dBm}. \quad (3.24)$$

For a signal that exploits ideally the minimum 500 MHz bandwidth $B_{min}$ (i.e. “brick-wall” spectrum), the maximum average power $P_{max}$ that can be transmitted is

$$P_{max} = \phi_{reg} \cdot B_{min} = -14.26 \text{ dBm}. \quad (3.25)$$

As a consequence, the loss due to the non-flat spectrum is $L_{nfs} = P_{max} - P_{Tx} = 2.34$ dB; this corresponds to a spectral occupancy of 76.4%, which can be considered as satisfactory for a low-complexity transmitter.
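
The chain of numbers in this section (pulse parameter $b$, spectral amplitude $a$, pulse amplitude, energy per pulse, average power and non-flat spectrum loss) can be reproduced with the sketch below. It only restates Eq. 3.20 to 3.25; the function and key names are hypothetical, and the loss is computed here as the difference between $P_{max}$ and $P_{Tx}$.

```python
import math

PHI_REG = 75e-15  # regulatory PSD limit in W/Hz (75 nW/MHz)


def pulse_parameter_b(bandwidth_hz):
    """Parameter b of the Gaussian pulse for a given -10 dB bandwidth (Eq. 3.20)."""
    return (2.0 / bandwidth_hz) * math.sqrt(-math.log(math.sqrt(0.1)) / math.pi)


def dbm(power_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


def gaussian_pulse_budget(pulse_rate_hz, bandwidth_hz=500e6, impedance_ohm=50.0):
    b = pulse_parameter_b(bandwidth_hz)
    # Peak one-sided PSD (a^2 / 2) * R_p set equal to the regulatory limit
    a = math.sqrt(2.0 * PHI_REG / pulse_rate_hz)
    amplitude = a / b                                   # 1-ohm normalized
    amplitude_rf = amplitude * math.sqrt(impedance_ohm) # volts at Z
    energy = a * a / (b * 2.0 * math.sqrt(2.0))         # energy per RF pulse
    p_tx_dbm = dbm(energy * pulse_rate_hz)
    p_max_dbm = dbm(PHI_REG * bandwidth_hz)             # brick-wall spectrum
    return {
        "b": b, "a": a, "A": amplitude, "A_RF": amplitude_rf,
        "E_s_RF": energy, "P_Tx_dBm": p_tx_dbm, "P_max_dBm": p_max_dbm,
        "L_nfs_dB": p_max_dbm - p_tx_dbm,
    }


for name, value in gaussian_pulse_budget(10e6).items():
    print(f"{name}: {value:.4g}")
```

Note that, with the peak PSD pinned to the regulatory limit, the average power $E_{s,RF} \cdot R_p$ does not depend on the pulse rate; only the energy per pulse and the pulse amplitude do.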

### 3.4.4. How Wide Should an IR-UWB Pulse Be?

The FCC regulation fixes the minimum bandwidth for a UWB signal to 500 MHz, whereas the European rules set it one order of magnitude lower, at 50 MHz (Sections 2.2 and 2.3, respectively). One may ask which one of these minimum bandwidths is adequate in terms of communication link reliability and transceiver complexity. It is known from wireless communication theory [36] that the signal bandwidth must be smaller than the coherence bandwidth $BW_c$ of the channel to avoid equalization. The coherence bandwidth $BW_c$ is defined as the interval over which two frequencies of a signal are likely to experience comparable or correlated amplitude fading when propagating through a multipath environment. When the signal bandwidth is smaller than the coherence bandwidth, a situation of flat fading occurs and the system does not require any equalization. Mathematically, $BW_c$ is related to the RMS delay spread $\tau_{RMS}$ as follows

$$BW_c = \frac{1}{\alpha \cdot \tau_{RMS}}, \quad (3.26)$$

where $\tau_{RMS}$ is expressed in seconds, and $BW_c$ in Hz. The coefficient $\alpha$ is strongly dependent on the channel statistics. For a probability of correlation of 90% \( (BW_{c,90}) \) between the considered frequency components, values between 0.5 and 50 are observed for \( \alpha \) in indoor channels \([36, 62]\); \( \alpha \) therefore has to be evaluated for each particular channel. The exact calculation of the coherence bandwidth is based on the autocorrelation of the channel frequency response \( H(f) \) \([63]\)

\[
BW_c(\Delta f) = \int_{-\infty}^{+\infty} H(f) \cdot H^*(f + \Delta f) \cdot df. \tag{3.27}
\]
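
A numerical evaluation of Eq. 3.27 can be sketched as follows. The code computes the frequency response of a list of discrete paths on a frequency grid, forms the normalized autocorrelation over the frequency shift $\Delta f$, and returns the smallest shift at which its magnitude falls below a chosen level (0.9 for $BW_{c,90}$). The function names, the grid size and the two-path test channel are assumptions made for this example and do not correspond to the IEEE 802.15.4a realizations.

```python
import cmath
import math


def frequency_response(paths, freqs):
    """H(f) of a channel given as (delay in s, amplitude) pairs."""
    return [sum(a * cmath.exp(-2j * math.pi * f * t) for t, a in paths)
            for f in freqs]


def coherence_bandwidth(paths, f_start, f_stop, n_points=801, level=0.9):
    """Smallest frequency shift (Hz) where the normalized autocorrelation
    of H(f) drops below `level`; None if it never does on the grid."""
    step = (f_stop - f_start) / (n_points - 1)
    freqs = [f_start + i * step for i in range(n_points)]
    h = frequency_response(paths, freqs)
    r0 = sum(abs(x) ** 2 for x in h) / n_points  # autocorrelation at zero shift
    for lag in range(1, n_points // 2):
        overlap = n_points - lag
        # Discrete counterpart of the integral of H(f) * conj(H(f + df))
        r = sum(h[i] * h[i + lag].conjugate() for i in range(overlap)) / overlap
        if abs(r) / r0 < level:
            return lag * step
    return None


# Two equal paths separated by 10 ns, evaluated between 3 and 5 GHz
two_path = [(0.0, 1.0), (10e-9, 1.0)]
print(coherence_bandwidth(two_path, 3e9, 5e9))
```

The resolution of the result is limited by the frequency step of the grid, so a finer grid is needed for channels with a long delay spread.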

Applying this equation on the indoor channel model proposed by the IEEE802.15.4a Committee results in values summarized in Table 3.2. Although much larger than in outdoor channels (eg. for mobile tele-

<table>
<thead>
<tr>
<th>Channel</th>
<th>\( \tau_{RMS} \)</th>
<th>\( BW_{c,90} \)</th>
<th>\( \alpha \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>CM1</td>
<td>17 ns</td>
<td>3.9 MHz</td>
<td>19.6</td>
</tr>
<tr>
<td>CM3</td>
<td>10 ns</td>
<td>9 MHz</td>
<td>11.1</td>
</tr>
<tr>
<td>CM7</td>
<td>9.1 ns</td>
<td>122 MHz</td>
<td>0.9</td>
</tr>
</tbody>
</table>

From Table 3.2, we observe that the maximum coherence bandwidth for IEEE 802.15.4a indoor LOS channels is 122 MHz (CM7). Choosing a minimum pulse bandwidth twice as large as this value, i.e. approximately 250 MHz, will ensure that the frequency components forming the pulse are uncorrelated and that the attenuation due to frequency-selective fading is reduced. To illustrate this, the Fourier transform of the channel’s impulse response between 3 and 5 GHz is given in Fig. 3.6 (thin line). This transfer function corresponds to the channel impulse response given in Fig. 3.24 (CM3r1). As can be observed, the transfer function experiences deep fades of up to -30 dB. The thick curve illustrates the amplitude variations of a 250 MHz wide Gaussian pulse sent through the channel. Owing to the large pulse bandwidth extending beyond the coherence bandwidth, the pulse amplitude variation is reduced to ±3.3 dB over the entire UWB lower band (horizontal dashed lines), while the fading in the narrowband case extends beyond ±22 dB. A summary of peak-to-peak signal fluctuations for both a narrowband signal and a 250 MHz Gaussian pulse in LOS indoor channels between 3 and 5 GHz is given in Table 3.3. In indoor LOS channels, a 250 MHz Gaussian pulse exhibits an average peak-to-peak signal fluctuation smaller than 10 dB over the 3-5 GHz frequency range. This is considered sufficient in a communication link to avoid harmful signal losses caused by small-scale fading.

<table>
<thead>
<tr>
<th>Channel</th>
<th>Narrowband</th>
<th>250-MHz Gaussian pulse</th>
</tr>
</thead>
<tbody>
<tr>
<td>CM1</td>
<td>\(\mu = 38.6\) dB, \(\sigma = 5.6\) dB</td>
<td>\(\mu = 4.6\) dB, \(\sigma = 1.5\) dB</td>
</tr>
<tr>
<td>CM3</td>
<td>\(\mu = 42.2\) dB, \(\sigma = 7.5\) dB</td>
<td>\(\mu = 6.3\) dB, \(\sigma = 1.5\) dB</td>
</tr>
<tr>
<td>CM7</td>
<td>\(\mu = 27.0\) dB, \(\sigma = 12.1\) dB</td>
<td>\(\mu = 9.6\) dB, \(\sigma = 3.6\) dB</td>
</tr>
</tbody>
</table>

### 3.4.5. Non-dithered Signal Spectrum

The mathematical expression developed in Section 3.4.1 is only valid for a single pulse and corresponds to the ESD. When pulses are repeated periodically, spectral lines appear at multiples of the repetition frequency \(R_p\). To ensure the same total transmitted power in \(B_{min}\) with respect to \(R_p\), the pulse amplitude must be varied proportionally to the square root of the inverse of \(R_p\) (-3 dB/octave dashed line in Fig. 3.7). The spectrum regulation states that measurements shall be made using a spectrum analyzer (SA) with a 1 MHz resolution bandwidth (RBW).\(^5\) Thus, when \(R_p\) becomes larger than the RBW (round marker), spectral lines appear in the PSD and the pulse amplitude needs to be reduced proportionally to the inverse of \( R_p \) in order to comply with the mask (solid curve). For example, at \( R_p = 10 \) MHz, a 10 dB power back-off must be taken into account in the link budget of the transceiver. For medium data rate applications (tens of Mp/s), the power back-off may be a serious issue. There are many ways to overcome the undesired spectral lines. They are known as spectrum “whitening” or “dithering” and comprise methods that randomize

- the pulse position, such as time-hopping pulse position modulations (PPM);
- the pulse polarity, such as binary phase shift keying (BPSK);
- the pulse phase, as with non-coherent modulation schemes.

---

\(^5\) The video bandwidth should be made three times the RBW, such that the video filter does not attenuate the spectral components.

Figure 3.6. Fourier transform of a realization of the IEEE802.15.4a channel’s impulse response between 3 and 5 GHz for channel CM3r1 (thin line, see corresponding impulse response in time domain in Fig. 3.24). The thick line illustrates the signal energy that could be collected when passing a 250 MHz Gaussian pulse through the channel CM3r1; the energy fluctuation over the UWB band due to multipath is strongly reduced, owing to the fact that the pulse’s bandwidth extends beyond the coherence bandwidth of the channel.

Another way to overcome the back-off issue is to use frequency modulation or hopping. Although the latter was much discussed at the adoption of UWB (see Section 2.5), the emission of pulses in $N$ different frequency bands allows an apparent pulse rate $P_r$ in one frequency band that is $N$ times smaller. As a consequence, the power back-off is reduced by $10 \cdot \log_{10}(N)$ dB. The proposed modulation using binary frequency shift keying (BFSK) makes use of this property (see next section).

Figure 3.7. Unmodulated pulse amplitude required to meet the FCC mask when the pulse repetition rate $R_p$ is swept for a fixed RBW=1 MHz. The results are expressed in terms of a normalized amplitude (round marker). Normalized pulse amplitude for a constant transmitted power is depicted by the dashed curve. Above $R_p = 1$ MHz, an additional power back-off must be considered due to the presence of spectral lines in the PSD (solid curve), which leads to a reduction of the transmitted power.
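
The back-off behaviour described in this section can be summarized by a small sketch. It assumes an unmodulated (non-dithered) pulse train, a 1 MHz RBW, an amplitude normalized to its value at $R_p$ = RBW, and pulses distributed evenly over `n_bands` frequency bands; the function names are hypothetical.

```python
import math


def normalized_amplitude(pulse_rate_hz, rbw_hz=1e6):
    """Pulse amplitude that meets the mask, relative to the value at R_p = RBW."""
    if pulse_rate_hz <= rbw_hz:
        # Constant transmitted power: amplitude ~ 1 / sqrt(R_p)
        return math.sqrt(rbw_hz / pulse_rate_hz)
    # Resolved spectral lines: amplitude ~ 1 / R_p
    return rbw_hz / pulse_rate_hz


def power_backoff_db(pulse_rate_hz, rbw_hz=1e6, n_bands=1):
    """Power back-off caused by spectral lines for a non-dithered pulse train."""
    apparent_rate = pulse_rate_hz / n_bands  # pulse rate seen in one band
    if apparent_rate <= rbw_hz:
        return 0.0
    return 10.0 * math.log10(apparent_rate / rbw_hz)


for rate in (0.1e6, 1e6, 10e6):
    print(rate, normalized_amplitude(rate), power_backoff_db(rate),
          power_backoff_db(rate, n_bands=2))
```

With two bands, as in a BFSK scheme, the back-off at 10 Mp/s is reduced by about 3 dB in this simplified picture.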

### 3.5. The Modulation Scheme

#### 3.5.1. Modulation Schemes for IR-UWB

There are basically four ways to convey digital information onto an electromagnetic wave: amplitude, frequency, phase and time. The choice of a digital modulation is limited by a set of constraints imposed by the targeted wireless link. These constraints typically are:

- the required signal-to-noise ratio (SNR) measured at the input of the receiver to achieve a given BER;
- the wireless link robustness with channels exhibiting multipath propagation, fading or dispersion caused by phase and delay distortion;
- the resistance against interferences and jamming;
- the spectral characteristics;
- the costs, the complexity and the power requirement to implement the solution.

The first two constraints are of primary concern in this work, whereas the last one is of importance when targeting commercial applications. As discussed in Chapter 2, we focus on signalling schemes that can be demodulated by simple noncoherent methods. This excludes the use of phase-based modulations, which require sophisticated methods for signal detection and longer overhead periods for synchronization.

The simplest noncoherent receiver that is often found in textbooks is based on a front-end bandpass filter, an envelope detector, a sample-and-hold device and a decision device (Fig. 3.8). This generic receiver is used to demodulate on-off keying (OOK) signals.

![Figure 3.8. Block schematic of the generic noncoherent receiver. The input bandpass filter limits the amount of noise before envelope (or nonlinear) detection.](image)

This type of receiver cannot be used for IR-UWB since the BER performance depends on the bandwidth $B_{RF}$ of the input bandpass filter. Low data rate IR-UWB systems, as targeted in this thesis, employ waveform bandwidths exceeding the bitrate $1/T_{\text{bit}}$ ($B_{RF} \cdot T_{\text{bit}} \gg 1$). For a given BER, this leads to an SNR that has to be increased approximately by a factor $B_{RF} \cdot T_{\text{bit}}$. Hence, IR-UWB receivers must feature baseband detection principles based on matched filter (or correlation) approaches [64]. We will see later in this chapter that this requirement can be relaxed by implementing sub-optimum matched filter detection methods, which are based on the “energy collection” or “integrate-and-dump” (I&D) principle. This simply corresponds to signal integration or, equivalently, a correlation with a rectangular function of finite duration $T_i$. This very simple method still provides acceptable performance for low-complexity IR-UWB receivers, especially in multipath environments.

As a starting point for the choice of a modulation scheme, we review in this section different solutions for noncoherent carrier-based IR-UWB, under the assumption that no multipath fading is present and that the incoming RF signal is corrupted by additive white Gaussian noise (AWGN) only. A collateral question arising with the choice of a modulation is how the bandpass RF signal has to be converted to a baseband signal with minimum SNR degradation (Section 3.5.3)? As a benchmark, an I&D filter detection, whose duration is equal to the pulse length, will be used to compare each modulation. This ensures that baseband signals, after frequency translation or removal of the RF carrier, are optimally detected.

**OOK**

Receivers for OOK as illustrated in Fig. 3.9-a are very popular for low-complexity wireless systems. To avoid classical local oscillator and mixer devices, it has often been proposed to downconvert bandpass IR-UWB signals by means of a nonlinear device (e.g., square-law or envelope detector). Such devices usually degrade the SNR before the detection process. This degradation is caused by the modification of the noise distribution due to the nonlinear device, which makes the matched filter principle no longer optimum for detection.

Ref. [65] derived an expression for OOK with square-law detection device that approximates reasonably the BER performance with an I&D detection method using integration time of $T_i > 10/B_{RF}$:

$$P_{e,OOK,\cap} \approx Q\left(\frac{2 \cdot \frac{E_b}{N_0}}{\sqrt{T_i B_{RF}} + \sqrt{T_i B_{RF} + 4 \frac{E_b}{N_0}}}\right), \quad (3.28)$$

where $B_{RF}$ is the bandwidth of the input bandpass filter, $T_i$ is the integration length of the simplified matched filter ($s_{mf}(t)$ in Fig. 3.9-a consists of a rectangular function of duration $T_i$) and $E_b/N_0$ is the SNR per bit. The function $Q(x)$ is the Gaussian error integral and is defined as

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-\frac{t^2}{2}} dt, \quad x \geq 0. \quad (3.29)$$

The theoretical BER curves for $B_{RF} \cdot T_i = 1$ and $B_{RF} \cdot T_i = 10$ (i.e., $T_i = 2$ ns and $T_i = 20$ ns, respectively, for $B_{RF} = 500$ MHz) are given by the dash-dotted blue curves in Fig. 3.10. The reference curve in Fig. 3.10 (continuous) corresponds to the BER performance obtained by demodulating, with the same integration time $T_i$, baseband OOK signals corrupted by AWGN (orthogonal signalling). The results show that, for a carrier-based OOK signalling scheme down-converted by a square-law function, the SNR required at a BER of $10^{-3}$ increases by approximately 1.5 dB, from 10.5 dB to 12.1 dB, for $B_{RF} \cdot T_i = 1$, whereas for $B_{RF} \cdot T_i = 10$ it increases to 13 dB. This BER degradation is usually acceptable for low-complexity receivers.
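As a quick numerical cross-check of the $B_{RF} \cdot T_i = 10$ figure quoted above, the following minimal sketch evaluates the Gaussian approximation and searches for the SNR that yields a BER of $10^{-3}$. It assumes that Eq. 3.28 has the form $Q\big(2\gamma / (\sqrt{T_i B_{RF}} + \sqrt{T_i B_{RF} + 4\gamma})\big)$ with $\gamma = E_b/N_0$; the function names and the bisection helper are illustrative choices and do not come from [65].

```python
import math

def q_function(x):
    """Gaussian error integral Q(x) of Eq. 3.29, written with erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_ook_square_law(ebn0_db, bt_product):
    """Approximate BER of OOK with square-law detection and I&D.

    Assumed form of Eq. 3.28; bt_product is B_RF * T_i.
    """
    gamma = 10.0 ** (ebn0_db / 10.0)  # Eb/N0 as a linear ratio
    denom = math.sqrt(bt_product) + math.sqrt(bt_product + 4.0 * gamma)
    return q_function(2.0 * gamma / denom)

def required_ebn0_db(bt_product, target_ber=1e-3, lo=0.0, hi=30.0):
    """Bisection on Eb/N0 in dB (the BER decreases monotonically with SNR)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ber_ook_square_law(mid, bt_product) > target_ber:
            lo = mid  # BER still too high: more SNR is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    snr_db = required_ebn0_db(10.0)
    print(f"B_RF*T_i = 10: Eb/N0 for BER 1e-3 = {snr_db:.2f} dB")
```

Under this assumption the search returns roughly 12.9 dB for $B_{RF} \cdot T_i = 10$, in line with the 13 dB read from Fig. 3.10. Since the approximation is stated for $T_i > 10/B_{RF}$, it should not be expected to reproduce the simulated $B_{RF} \cdot T_i = 1$ curve.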

However, OOK signalling raises some important issues for practical implementation. The strong dependence of the signal level on the data makes it difficult to set the threshold value for an optimum decision. The optimum threshold does not correspond to half the pulse energy, since the nonlinear device does not raise the mean value of the demodulated signal for the “off” states in the same manner as for the “on” states. Theoretically, the optimum threshold is defined by a complex equation depending on the SNR, whereas in practice, it requires an a posteriori setting based on a training sequence of equiprobable “0” and “1”.

**Binary PPM (BPPM)**

PPM has always been seen as one of the best candidates for IR-UWB owing to the simplicity of its implementation. As depicted in Fig. 3.9-b, a benefit over OOK is the presence of a signal for any symbol, either in the first or the second half of the bit period $T_{bit}$. Moreover, the signal detection is much easier than for OOK since the energy in each of these half bit periods is simply compared. When using the same conditions as for Eq. 3.28, [65] derived this expression for the probability of error of BPPM with I&D detection:

$$P_{e,BPPM,\cap} \approx Q\left(\sqrt{\frac{1}{2} \cdot \frac{\left(E_b/N_0\right)^2}{E_b/N_0 + B_{RF} T_i}}\right). \quad (3.30)$$

The corresponding BER curves for $B_{RF} \cdot T_i = 1$ and $B_{RF} \cdot T_i = 10$ are reported in Fig. 3.10 (dashed curves).
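To illustrate how the time-bandwidth product penalizes BPPM with energy detection, the sketch below tabulates the approximate BER for both values of $B_{RF} \cdot T_i$. It assumes that Eq. 3.30 reads $Q\big(\sqrt{\tfrac{1}{2}\gamma^2 / (\gamma + B_{RF} T_i)}\big)$ with $\gamma = E_b/N_0$, which is the usual Gaussian approximation for comparing the energies of two integration windows; the function names are illustrative only.

```python
import math

def q_function(x):
    """Gaussian error integral Q(x) of Eq. 3.29, written with erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_bppm_energy_detection(ebn0_db, bt_product):
    """Approximate BER of BPPM with I&D detection (assumed form of Eq. 3.30)."""
    gamma = 10.0 ** (ebn0_db / 10.0)  # Eb/N0 as a linear ratio
    return q_function(math.sqrt(0.5 * gamma * gamma / (gamma + bt_product)))

if __name__ == "__main__":
    snr_points_db = (8, 10, 12, 14)
    for bt in (1.0, 10.0):
        row = ", ".join(
            f"{ber_bppm_energy_detection(snr, bt):.2e}" for snr in snr_points_db
        )
        print(f"B_RF*T_i = {bt:4.1f}: BER at {snr_points_db} dB = {row}")
```

In this form, a larger $B_{RF} \cdot T_i$ always increases the error probability at a given $E_b/N_0$, because the longer integration window collects more noise without collecting more signal energy.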

**Noncoherent Binary FSK**

We assume a transmitted signal modulated at center frequencies $f_c \pm \Delta f/2$. The noncoherent demodulation of BFSK, illustrated in Fig. 3.9-c, is equivalent to noncoherent PPM, since it relies on the detection of a signal in either of the two frequency bands (instead of two half bit periods) centered around $f_c + \Delta f/2$ and $f_c - \Delta f/2$. The disadvantage is the need for “preselection” bandpass filters centered at $f_c \pm \Delta f/2$, which double the circuit complexity. On the other hand, BFSK features advantages over both OOK and PPM. First, for the same pulse rate, it is less affected by interpulse interference (IPI), and second, it only requires a zero threshold voltage for the decision operation.

**Figure 3.9.** Typical modulation schemes and practical receiver implementation proposed for carrier-based IR-UWB wireless communication. a) Carrier based OOK modulation scheme with noncoherent receiver based on square-law demodulator, b) binary PPM scheme with noncoherent receiver, c) binary FSK scheme with classical noncoherent demodulation.

Figure 3.10. Bit error rate performance of noncoherent receivers using OOK, BPPM and BFSK as illustrated in Fig. 3.8 with integration duration $T_i = 1/B_{RF}$ (simulated) and $T_i = 10/B_{RF}$ (given by Eq. 3.28 and 3.30). Since each of these modulation schemes is derived from orthogonal modulation, they are compared to the optimum (matched filter) BER performance of an orthogonal signalling scheme (thin continuous curve). The thick continuous curves show the BER performance of an orthogonal signalling scheme in AWGN for I&D detection of duration $T_i = 1/B_{RF}$ and $T_i = 10/B_{RF}$. It is interesting to note that noncoherent demodulation with square-law devices (dashed and dash-dotted curves) performs better with longer integration times than orthogonal signalling in AWGN. This makes square-law demodulation a preferred method for IR-UWB.

### 3.5.2. The Need for Carrier-based IR-UWB

When considering the receiver topologies given in Fig. 3.9, it is important to note that down-conversion by means of a nonlinear device clearly lacks selectivity. Such solutions require bulky RF bandpass filters and are usually not practical for dense circuit integration. By using multiple carriers for IR-UWB, the wireless environment accommodates a large number of users sharing different parts of the spectrum, and very strong signals can coexist next to very weak ones. For the class of devices targeted in this work (single antenna), frequency division multiple access (FDMA) is also the only plausible diversity strategy for enabling a reliable communication link. If the wireless devices are stationary or seldom moving (e.g., wireless devices attached to computers or to most sensors), time diversity is not a reliable option, while small size rules out spatial diversity (e.g., multiple antennas).

It is consequently evident that FDMA schemes can provide a flexible use of the spectrum masks, minimize interference to existing narrowband systems by sub-band selection and facilitate the scalability of the spectrum for wider bandwidth usage. Moreover, since each band occupies only a fraction of the FCC mask, the emitted pulses have longer duration in time compared to a single band approach, which in turn eases the implementation of signal detection and synchronization.

Another important advantage for IR-UWB devices using frequency diversity and transmitting at pulse rates above 1 Mb/s is described hereafter. FDMA or frequency hopping methods help to reduce the emitted power back-off issue illustrated in Fig. 3.7. Assuming a frequency hopping duration between channels shorter than 1 µs, the apparent pulse rate in a single channel is smaller than the overall pulse rate. This in turn reduces the power back-off in each channel used for frequency hopping and enables a more effective use of the transmitted power. Fast hopping schemes can also be useful to cancel IPI in channels with strong signal dispersion. Instead of using sophisticated equalization to combat dispersion, the frequency channel can be hopped after each transmitted bit. Thus, at the receiver, directly after a synchronous carrier switching (we assume that transmitter and receiver are synchronized), the energy potentially leaking into the following bit period is rejected because it lies in the adjacent frequency band. Note that the two mechanisms described in this paragraph are not quantitatively investigated and are only given to support the argument for carrier-based IR-UWB wireless systems.

### 3.5.3. Implementation Issues for Down-conversion

Although PPM modulation shows a clear advantage in terms of complexity at the signal detection level over OOK (threshold setting) and BFSK (bandpass filter), the implementation of this scheme becomes less straightforward in “channelized” or carrier-based receivers.

Two down-conversion principles are shown in Fig. 3.11. The first one (a) uses a single-sideband (SSB) heterodyne mixing process and the second one (b) is based on zero-IF (ZIF or homodyne) frequency translation. In both cases, quadrature down-conversion must be used: in a), a quadrature mixer using a Hartley or Weaver architecture [66,67] must be employed to remove the image frequency\(^6\), whereas in b), quadrature demodulation must be employed to process the baseband signal. Note also that the noise figure of an SSB mixer is twice that of a double-sideband (DSB) mixer.

![Figure 3.11. Down-conversion principles illustrated in frequency domain: a) SSB down-conversion with image rejection and b) zero-IF down-conversion with DC offset cancelation.](image)

Using zero-IF down-conversion techniques, a second issue appears at the baseband amplification of carrier-based PPM signals. Once mixed with a local oscillator (LO), the incoming signal is located around DC, with maximum power density at DC. In this case, offset voltages can corrupt the signal and even saturate the baseband stage. Homodyne receivers used with signals exhibiting a peak PSD at zero frequency require some means of offset cancelation. Using a simple highpass filter to remove the offset requires a corner frequency 10,000 times smaller than the channel bandwidth [68]. For IR-UWB, this translates into a corner frequency below 25 kHz, which still requires quite large on-chip capacitors. Other solutions, such as DC offset cancelation loops, do not affect the received spectrum but need idle communication time intervals to carry out the cancelation or to bring the loop to its steady-state value. The problem of offset can be alleviated by using a signal spectrum containing little energy near DC. This technique is particularly suited, for example, to IR-UWB modulation using binary FSK, where several MHz around DC can be “wasted” without significant loss in performance.

\(^6\)Unfortunately, the image rejection of these mixer architectures is limited to a bandwidth of a few MHz and therefore does not meet the UWB requirement. In Hartley topologies, perfect image cancelation occurs at only one frequency. Weaver mixers perform better, but they require a dual-frequency down-conversion receiver, which increases the complexity of the receiver.

### 3.5.4. Proposed Modulation Scheme

In the previous sections, we have seen that

- channelized architectures are preferred for a practical implementation of compact devices (Section 3.5.3);
- the use of a signalling scheme with very little energy in the center band is desired for zero-IF down-conversion (Section 3.5.3);
- a signal bandwidth in the order of 250 MHz is sufficient to mitigate small-scale fading (Section 3.4.4).

Based on these considerations, we propose the following modulation scheme using a binary frequency modulation of IR-UWB pulses. As illustrated in Fig. 3.12, a bit “0” is represented by a pulse centered at \( f_c - f_o/2 \) and a bit “1” by a pulse centered at \( f_c + f_o/2 \), where \( f_c \) is the center frequency of the sub-band and \( f_o \) is the deviation frequency. This modulation also features a better mask occupancy than single-carrier signalling and allows a reduced out-of-band signal energy in order to minimize the co-channel interference.

A further advantage provided by longer pulses is that the required peak voltage at the antenna input is reduced, thus enabling an easier low voltage implementation, especially at low pulse repetition rates, where the peak voltage could be prohibitively large for deep submicron CMOS implementations. For example, when considering a single band approach using a pulse that occupies the whole frequency range from 3.1 to 4.8 GHz (BW=1.7 GHz), the half-amplitude pulse length is given by Eq. 3.20 and is approximately 710 ps. Considering a pulse rate of 1 Mp/s, a peak-to-peak RF pulse amplitude \( A_{RF} \) of 7.7 V is required at the input of a lossless 50 Ω isotropic antenna to maximize the transmitted power. This is to be compared with the typical 1.2 V (1.8V) supply voltage of current 0.13 μm (0.18 μm) CMOS technologies. The peak pulse voltage is furthermore increased when a passive and lossy shaping filter is used before the antenna in order to attenuate the unwanted emissions that may appear in the forbidden frequency bands defined by the regulation.

Consequently, the reduced peak voltage, which is allowed by the proposed modulation scheme, has a direct influence on the amount of linearity required at the output stage of the transmitter (to avoid the spectral regrowth caused by distortion, which increases adjacent-channel interference), and thus on the power consumption of the radio-frequency front-end.

Figure 3.12. Temporal and normalized energy spectrum density (ESD) representations of the proposed BFSK/AM-C modulation. A bit “0” is represented by a pulse centered at \( f_c - f_o / 2 \) and a bit “1” by a pulse centered at \( f_c + f_o / 2 \). In this illustration, \( f_c = 4 \) GHz and \( f_o = 250 \) MHz. The overall signal’s -10 dB bandwidth is then defined as the sum of the deviation frequency \( f_o \) and the pulse’s bandwidth, which corresponds to 500 MHz.

### 3.5.5. IR-UWB BFSK Signalling Parameters

An overview of the main parameters of the transmitted signal is given in Table 3.4 for two different practical pulse rates, at which the transceiver is not affected by interpulse interference (see later in Section 3.8.4 and Table 3.5).

By using BFSK, the pulse rate in a frequency band is halved and the power back-off is reduced from 10 dB to 6.9 dB (see Section 3.4.5). Figure 3.13 illustrates these effects: at 10 Mp/s, the simulated PSD is in excess of 6.9 dB due to spectral lines. These can actually be reduced to below 3 dB with the help of dithering methods (e.g., by randomly choosing the polarity of the envelope, assuming a constant phase of the RF modulating signal).

<table>
<thead>
<tr>
<th>$b$ (≈ pulse length)</th>
<th>$4.843 \cdot 10^{-9}$ [s]</th>
<th>$4.843 \cdot 10^{-9}$ [s]</th>
</tr>
</thead>
<tbody>
<tr>
<td>$a$</td>
<td>$173.2 \cdot 10^{-12}$</td>
<td>$244.9 \cdot 10^{-12}$</td>
</tr>
<tr>
<td>$A_{RF}(50\,\Omega)$</td>
<td>$\sqrt{50}\,\frac{a}{b} = 253$ [mV]</td>
<td>$\sqrt{50}\,\frac{a}{b} = 358$ [mV]</td>
</tr>
<tr>
<td>$A_{RF}(100\,\Omega)$</td>
<td>$\sqrt{100}\,\frac{a}{b} = 358$ [mV]</td>
<td>$\sqrt{100}\,\frac{a}{b} = 506$ [mV]</td>
</tr>
<tr>
<td>$E_{s,RF}$</td>
<td>2.19 [pJ]</td>
<td>4.38 [pJ]</td>
</tr>
<tr>
<td>$P_{tx}$</td>
<td>$-16.6$ [dBm]</td>
<td>$-16.6$ [dBm]</td>
</tr>
</tbody>
</table>
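The tabulated values can be cross-checked with a few lines of arithmetic. The sketch below recomputes the peak amplitudes, the pulse energy and the mean transmitted power from $a$ and $b$. Two points are assumptions rather than statements from the table: the RF pulse energy is taken as half the energy $a^2/(b\sqrt{2})$ of the baseband Gaussian envelope (the carrier averages to one half), and the two columns are taken to correspond to 10 Mp/s and 5 Mp/s, which is what the equal $P_{tx}$ entries suggest. All function and variable names are illustrative.

```python
import math

B_PULSE = 4.843e-9  # parameter b from the table [s]

# label: (parameter a from the table, ASSUMED pulse rate in pulses per second)
COLUMNS = {
    "column 1": (173.2e-12, 10e6),
    "column 2": (244.9e-12, 5e6),
}

def peak_amplitude_mv(a, b, r_load):
    """Peak RF amplitude sqrt(R) * a / b, returned in millivolts."""
    return 1e3 * math.sqrt(r_load) * a / b

def rf_pulse_energy_pj(a, b):
    """RF pulse energy in pJ, ASSUMED to be half the envelope energy a^2 / (b * sqrt(2))."""
    return 1e12 * a * a / (2.0 * math.sqrt(2.0) * b)

def mean_tx_power_dbm(energy_pj, pulse_rate):
    """Mean transmitted power (energy per pulse times pulse rate) in dBm."""
    power_mw = energy_pj * 1e-12 * pulse_rate * 1e3
    return 10.0 * math.log10(power_mw)

if __name__ == "__main__":
    for label, (a, rate) in COLUMNS.items():
        energy = rf_pulse_energy_pj(a, B_PULSE)
        print(
            f"{label}: A_RF(50) = {peak_amplitude_mv(a, B_PULSE, 50.0):.0f} mV, "
            f"A_RF(100) = {peak_amplitude_mv(a, B_PULSE, 100.0):.0f} mV, "
            f"E = {energy:.2f} pJ, P_tx = {mean_tx_power_dbm(energy, rate):.1f} dBm"
        )
```

With these assumptions the script reproduces the tabulated amplitudes, energies and the common $-16.6$ dBm transmitted power for both columns.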

An additional equivalent loss term $L_{mod} \approx 3.3$ dB must be included in the permissible transmitted power in order to take into account this peaking effect due to the modulation (Fig. 3.14).

### 3.6. Receiver Architecture

The restricted frequency range and the steep characteristic of the spectrum masks, especially the one adopted by the European Community, prevent the use of true baseband-generated IR-UWB signals (e.g., monocycles), which are ill-defined in terms of spectrum. Thus, a certain degree of accuracy in the center frequency of the IR-UWB pulses is required at the transmitter and, consequently, at the receiver as well.

The receiver front-end will thus use a frequency conversion. During the last decade, the direct-conversion (or zero-IF) architecture has been recognized as the one having the potential for the highest integration and lowest power [68, 69]. However, from a technical point of view, direct conversion as used in conventional narrowband receivers suffers from problems such as DC offset, local oscillator re-radiation, flicker noise and I/Q mismatch. As we will see later, these aspects are less problematic with carrier-based IR-UWB signals. More particularly, the chosen modulation scheme helps to solve these problems by the use of a large modulation index, thus creating a signal with practically no energy in the center of the frequency band. This property has already been used in the past with some paging standards (POCSAG, FLEX) or with WLAN systems such as the IEEE 802.11b standard.

Figure 3.13. Simulated and theoretical non-dithered (one-sided) PSDs for a pulse rate of 10 MHz modulated by random data. The peaks are caused by a periodicity in the modulation of the pulses.

Figure 3.14. Simulated and theoretical dithered (one-sided) power spectral densities for a pulse rate of 10 MHz modulated by random data. The random polarity modulation helps to reduce the spectral lines by more than 3 dB.

The block schematic of the receiver is depicted in Fig. 3.15. The received signal is first picked up by the antenna and, if necessary, filtered by a preselect filter. The aim of this filter is to attenuate the strong out-of-band signals which may saturate the receiver front-end. The amount of required filtering will be investigated in Section 3.11.3. The signal is then amplified by a low noise amplifier (LNA).

![Fig. 3.15. Low-complexity UWB receiver using a direct-conversion architecture.](image)

The down-conversion and the channel selection are realized with two mixers and a voltage-controlled oscillator (VCO), whose frequency $f_c$ is the arithmetic mean of the two frequencies $f_c - f_o/2$ and $f_c + f_o/2$. Since the down-converted BFSK signal carries information on both sides of its spectrum, the process must involve quadrature mixing, where one replica of the local oscillator or the RF signal is shifted by $90^\circ$.

This architecture is particularly well suited to a low-noise implementation since the baseband signals are centered at half the deviation frequency, $f_o/2$, and have almost no DC content. This avoids the effect of flicker noise that usually harms the performance of direct-conversion narrowband receivers using CMOS technology (modern deep-submicron CMOS technologies have flicker noise corner frequencies in the MHz range). Furthermore, since the received baseband signal actually has a bandpass characteristic with a large lower cutoff frequency (typically above 50 MHz), the DC offset can easily be removed by the decoupling capacitors of the amplifier chain.

After the down-conversion, the signal is filtered by a channel filter and amplified by a variable gain amplifier (VGA). The signal at the VGA output is further processed by the data demodulator/decoder. The role of the channel filter is primarily to suppress the down-converted signals originating from devices occupying the adjacent channels. The type of transfer function and the order of this filter depend mainly on the amount of suppression needed. However, in the following sections, since no adjacent signal will be considered, a Gaussian transfer function will be used in order to enable a mathematical expression of the receiver’s performance (see Section 3.7.3).

### 3.7. Demodulation Principle

### 3.7.1. Introduction

The implementation of a noncoherent BFSK demodulator as depicted in Fig. 3.9-c is not practical. At the same time, a main goal of this thesis is to avoid high-speed ADCs. Although state-of-the-art ADCs require as little as a few milliwatts of power [70], a fast digital signal processing (DSP) device is required to process the flow of samples for synchronization and detection, which drastically increases the power consumption.

The basic idea of the proposed FSK demodulator was first presented by Scheaffer in 1942 [71] and is better known as the balanced quadricorrelator. The down-converted BFSK signal is recovered (demodulated) from its zero-crossings. A typical implementation of the demodulator uses two pairs of cross-coupled edge-triggered sample-and-hold (S&H) blocks as depicted in Fig. 3.16. For a pair of S&H blocks, the input signals are the I and Q baseband signals from the quadrature mixers. In principle, the demodulator is simply an S&H driven at its CLK input (S&H input marked with a $\Delta$) by the zero-crossings of the I branch and at its analog input by the Q branch (analog signal). If we consider CW quadrature signals at the inputs, the S&H output attains one steady state if the zero-crossing transitions at the CLK input lead the zero-crossing transitions at the analog input, and the other state if they lag. This corresponds exactly to a positive or negative frequency shift of the carrier, and consequently, to the data. If both quadrature I and Q signals have the same time-varying amplitude characteristic and are perfectly 90° out of phase, the output of the demodulator is proportional to the envelope of the signal at the analog input. The output signal at the zero-crossing instants $t_k$ of the I and Q input signals can be mathematically expressed as

$$
\begin{aligned}
s_{dem}(t = t_k) = {} & s_Q(t|_{s_I=0\uparrow}) - s_I(t|_{s_Q=0\uparrow}) + \ldots \\
& s_Q(t|_{s_I=0\downarrow}) - s_I(t|_{s_Q=0\downarrow}),
\end{aligned} \quad (3.31)
$$

where \(\uparrow\) and \(\downarrow\) denote the signal’s rising and falling edges, respectively.

Figure 3.16. Implementation of the BFSK demodulator. The principle is shown for signals with a time-varying envelope (down-converted bandpass Gaussian pulses).
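The zero-crossing principle described above can be tried out numerically. The following sketch is a simplified, noise-free model of one rising-edge S&H pair only: it samples $s_Q$ at the rising zero-crossings of $s_I$, samples $s_I$ at the rising zero-crossings of $s_Q$, and accumulates the difference. The sampling rate, the LO phase offset, the helper names and the use of a sum over all crossings are illustrative assumptions, not the circuit implementation; the pulse parameters ($f_o/2 = 125$ MHz, $b = 4.843$ ns) follow the values quoted in this chapter.

```python
import numpy as np

def rising_zero_crossings(x):
    """Indices of the first sample after each negative-to-positive transition."""
    return np.where((x[:-1] <= 0.0) & (x[1:] > 0.0))[0] + 1

def rising_edge_sh_pair(s_i, s_q):
    """Sum of s_Q held at rising crossings of s_I minus s_I held at rising crossings of s_Q."""
    q_held = s_q[rising_zero_crossings(s_i)]
    i_held = s_i[rising_zero_crossings(s_q)]
    return float(q_held.sum() - i_held.sum())

def baseband_bfsk_pulse(sign, f_dev=125e6, b=4.843e-9, fs=10e9, n=400):
    """Noise-free I/Q Gaussian pulse; sign = +1 or -1 selects the upper or lower tone."""
    t = (np.arange(n) - n / 2) / fs
    envelope = np.exp(-np.pi * (t / b) ** 2)
    phase = 2.0 * np.pi * f_dev * t + 0.3  # 0.3 rad: arbitrary LO phase offset
    return envelope * np.cos(phase), sign * envelope * np.sin(phase)

if __name__ == "__main__":
    for sign in (+1, -1):
        s_i, s_q = baseband_bfsk_pulse(sign)
        print(f"frequency offset sign {sign:+d}: output = {rising_edge_sh_pair(s_i, s_q):+.3f}")
```

In this sketch a positive frequency offset produces a negative output and a negative offset a positive one; this mapping is only a convention of the model. Only the polarity of the output carries the data here, since with pulses this short its magnitude depends on where the few zero-crossings fall within the envelope.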

### 3.7.2. Outline and Methodology

In this section, we will estimate the influence of such a demodulator in the receiver chain. More specifically, the goal is to evaluate the degradation of the SNR between the input and the output of the demodulator in the presence of short carrier-modulated Gaussian pulses. Basically, we will see that the effect of such a device in the presence of noise is to reduce the equivalent output envelope of the incoming signal because of the non-optimum sampling instants \(t_k\) defined by the noisy input signal. We first investigate how the zero-crossing instants are affected by noise by calculating the phase statistics of the noisy input I and Q signals (Section 3.7.4, Fig. 3.19). We calculate this effect for a constant-envelope input signal (Section 3.7.4, Fig. 3.20) and then estimate it for pulsed signals having a Gaussian envelope (Eq. 3.61). The goal is to obtain an expression of the demodulated signal as a function of 1) the transceiver parameters and 2) the SNR at the input of the receiver. Finally, to obtain an estimate of the BER performance in AWGN, we first translate the latter SNR into a value that represents the energy per bit $E_b$; the second step consists in expressing the noise PSD $N_o$ at the output of the demodulator (Section 3.7.5).

### 3.7.3. Transceiver Model

The simple relationship between a Gaussian signal expressed in the time and frequency domains has already been presented in Section 3.4.1. This formalism will allow us to derive the equations that describe the basic relation between the input and output signals of Gaussian filters when the input signal is itself a Gaussian pulse. Since only the envelope of the frequency-translated Gaussian pulse will be affected by the filtering operation, the following calculations will be applied to the lowpass equivalent version of the bandpass signals and filters in order to keep a clear notation.

The high-level theoretical model of the transceiver is depicted in Fig. 3.17. The transmitter consists of a source emitting Dirac pulses and a transmit filter to obtain the desired Gaussian pulse shape. In this mathematical description, the propagation channel is modeled with AWGN only. The receiver consists of the channel filter, which limits the RF input bandwidth, the quadricorrelation-based demodulator and the decoder. The latter is based on the matched filter principle. Note that in the decoder, we could also employ a rectangular function to obtain an I&D detection method, as described in Section 3.5.1. We also assume a perfect synchronization between the transmitter and the receiver.

In the following sections, the parameters $b_{tx}$, $b_c$, $b_d$ and $b_{rx}$ define the signal at the output of the transmit filter, the channel filter, the demodulator and the receive filter, respectively. Greek letters ($\beta_{tx}$, $\beta_c$ and $\beta_{rx}$) are used to identify the transfer functions or the impulse responses of the transmit, channel and receive filters (see notation in Fig. 3.17). Note that since the transmit filter is excited by a Dirac pulse (“source” block), the relation $\beta_{tx} = b_{tx}$ holds.

The transmitted signal $s_{tx}(t, b_{tx})$ and its Fourier transform $S_{tx}(f)$ are defined by

$$s_{tx}(t, b_{tx}) = \frac{a_{tx}}{b_{tx}} \cdot e^{-\pi \left( \frac{t}{b_{tx}} \right)^2} \quad \Rightarrow \quad S_{tx}(f) = a_{tx} e^{-\pi \left( b_{tx} f \right)^2}, \quad (3.32)$$
where $b_{tx}$ is a parameter proportional to the length of the transmitted signal. To calculate the signal $s_c(t, b_c)$ at the input of the demodulator, we first define the transfer function of the channel filter $H_c(f)$ as

$$H_c(f, \beta_c) = e^{-\pi(\beta_c f)^2}, \quad (3.33)$$

where $\beta_c$ is a parameter inversely proportional to the bandwidth of the channel filter. The spectrum $S_c(f)$ of the signal $s_c(t, b_c)$ at the output of the channel filter is thus the product of the Fourier transform $S_{tx}(f)$ of the transmitted signal $s_{tx}(t)$ and the transfer function of the channel filter $H_c(f)$:

$$S_c(f) = S_{tx}(f) \cdot H_c(f) = a_{tx}e^{-\pi[(b_{tx}f)^2 + (\beta_c f)^2]} = a_{tx}e^{-\pi(b_c f)^2}, \quad (3.34)$$

where $b_c = \sqrt{b_{tx}^2 + \beta_c^2}$. In the time domain, the signal $s_c(t, b_c)$ is then expressed as

$$s_c(t, b_c) = \frac{a_{tx}}{b_c} \cdot e^{-\pi\left(\frac{t}{b_c}\right)^2} = A_c \cdot e^{-\pi\left(\frac{t}{b_c}\right)^2}. \quad (3.35)$$

According to Eq. 3.14, the equivalent pulse energy $E_c$ at the output of the channel filter is then

$$E_c = \int_{-\infty}^{\infty} |s_c(t)|^2 \, dt = \frac{a_{tx}^2}{b_c \sqrt{2}} \quad (3.36)$$
and the ratio between the energy at the input and the output of the channel filter (channel filter attenuation) is simply

$$\alpha_c = \frac{E_c}{E_{tx}} = \frac{b_{tx}}{\sqrt{b_{tx}^2 + \beta_c^2}}. \quad (3.37)$$

With the help of the used formalism, we notice that the derivation of the signal parameters involves very simple calculations.
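The relations above can be verified numerically. The following sketch computes $b_c$, $E_c$ and $\alpha_c$ from Eqs. 3.34 to 3.37 and compares the closed-form energy with a direct numerical integration of $|s_c(t)|^2$. The values of $a_{tx}$ and $b_{tx}$ are taken from the signalling parameters of Section 3.5.5, whereas the channel filter parameter `beta_c` is a hypothetical value chosen only for illustration; the function names are likewise illustrative.

```python
import numpy as np

def gaussian_pulse(t, a, b):
    """Gaussian pulse (a / b) * exp(-pi * (t / b)^2), cf. Eqs. 3.32 and 3.35."""
    return (a / b) * np.exp(-np.pi * (t / b) ** 2)

def channel_filter_output(a_tx, b_tx, beta_c):
    """Return (b_c, E_c, alpha_c) according to Eqs. 3.34, 3.36 and 3.37."""
    b_c = np.hypot(b_tx, beta_c)            # b_c = sqrt(b_tx^2 + beta_c^2)
    e_c = a_tx ** 2 / (b_c * np.sqrt(2.0))  # energy of the filtered pulse
    alpha_c = b_tx / b_c                    # channel filter attenuation E_c / E_tx
    return b_c, e_c, alpha_c

if __name__ == "__main__":
    a_tx, b_tx = 173.2e-12, 4.843e-9  # from the signalling-parameter table
    beta_c = 3.0e-9                   # HYPOTHETICAL channel filter parameter [s]

    b_c, e_c, alpha_c = channel_filter_output(a_tx, b_tx, beta_c)

    # Numerical check of Eq. 3.36 with a simple Riemann sum over +/- 10 b_c.
    t = np.linspace(-10.0 * b_c, 10.0 * b_c, 20001)
    dt = t[1] - t[0]
    e_numeric = np.sum(gaussian_pulse(t, a_tx, b_c) ** 2) * dt

    print(f"b_c = {b_c:.3e} s, alpha_c = {alpha_c:.3f}")
    print(f"E_c closed form = {e_c:.4e}, numerical = {e_numeric:.4e}")
```

The closed-form and numerical energies should agree to within the integration accuracy, and $\alpha_c$ approaches one when the channel filter is much wider than the pulse spectrum ($\beta_c \ll b_{tx}$).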

### 3.7.4. Envelope Detection Principle

The functional description of the demodulation process is illustrated in Fig. 3.18-a for a pair of S&H blocks (rising edge). A comparator has been inserted before the triggering input to illustrate the generation of a triggering signal. We consider the noise as referred to the input of the demodulator. This noise corresponds to band-limited Gaussian noise $n_I(t)$ and $n_Q(t)$, which is added to the I and Q signals, respectively. The noise displaces the zero-crossings (jitter noise) that trigger the sampling instants $t_k$. This results in non-ideal sampling instants and thus in a reduced equivalent average amplitude $s_{d,Q}(t)$ at the demodulator output (Fig. 3.18-b). In this section, we will first consider only one S&H device with its related blocks (highlighted in black in Fig. 3.18-a). We first focus on the effect of the noise at the triggering input (marked with a $\Delta$). The calculation of the output amplitude degradation therefore demands the calculation of the statistical properties of the zero-crossings. The latter are obtained through the statistical expression of the phase of the noisy input signal, which is evaluated in the next section.

**Figure 3.18.** Functional description of the BFSK demodulator based on cross-coupled S&H devices (a) and output signal (b).

**Phase Statistics**

In Fig. 3.18, the zero-crossing distribution in time domain of the signal $x_I = s_{c,I}(t) + n_I(t)$ defines the sampling instants $t_k$, which in turn influence the sampled values of the signal $x_Q(t)$. In the following development, the signal at the analog input of the S&H is considered as noise free ($n_Q(t) = 0$), that is $x_Q(t) = s_{c,Q}(t)$. Since the sampling process is a linear operation, we will analyze the effect of noise $n_Q(t)$ at the analog input separately in Section 3.7.5. This initial analysis will consider CW signals. The signal $x_I(t)$ at the comparator input of the S&H device is modeled as a sinusoidal CW plus additive bandpass filtered (channel filter) Gaussian noise $n_I(t)$ with a two-sided power spectral density $N_I(f) = \frac{N_0}{2} |H_c(f)|^2$, that is (for the in-phase component I)

$$x_I(t) = s_{c,I}(t) + n_I(t) = A_c \cdot \cos(\omega_0 t) + n_I(t), \quad (3.38)$$

where $A_c$ is the amplitude of the CW signal at the input of the demodulator, $\omega_0 = 2\pi f_0$ is the angular frequency of the signal and $n_I(t)$ is the band-limited noise. In this analysis of the static behavior of the demodulator output, we will assume that the noise $n_I(t)$ is narrowband in the sense that its equivalent noise bandwidth $B_n$, which is determined by the transfer function $H_c(f)$ of the channel filter, is smaller than $f_0$. With the latter assumption, $n_I(t)$ can be written in quadrature form [72]:

$$n_I(t) = n_i(t) \cdot \cos(\omega_0 t) + n_q(t) \cdot \sin(\omega_0 t), \quad (3.39)$$

where $n_i(t)$ and $n_q(t)$ are referred to as the in-phase and the quadrature (I&Q) components of $n_I(t)$, and are random processes whose spectrum is the bandpass spectrum, which has been frequency translated around zero frequency (lowpass equivalent). The envelope and phase representation of the signal $x_I(t)$ in Eq. 3.38 can be rewritten

$$x_I(t) = r(t) \cdot \cos(\omega_0 t + \theta(t)), \quad (3.40)$$

where the envelope $r(t)$ is given by

$$r(t) = \sqrt{(A_c + n_i(t))^2 + n_q^2(t)} \quad (3.41)$$
and the phase is expressed as
\[ \theta(t) = \tan^{-1} \frac{n_q(t)}{A_c + n_i(t)}. \tag{3.42} \]

Both the envelope \( r(t) \) and the phase \( \theta(t) \) are also lowpass, i.e., slowly varying with respect to \( n_I(t) \). The joint probability density function (pdf) of the two variables \( r \) and \( \theta \) is given as

\[ P(r, \theta) = \frac{r}{2\pi\sigma_c^2} \exp\left(- \frac{r^2 + A_c^2 - 2A_c r \cos \theta}{2\sigma_c^2}\right), \tag{3.43} \]

where
\[ \sigma_c^2 = \mathbb{E}(n_I(t)^2) \tag{3.44} \]
represents the power of the bandpass Gaussian noise \( n_I(t) \) at the demodulator input.

The pdf of \( \theta \), \( p(\theta) \), can be obtained by integrating out \( r \) in the joint pdf \( P(r, \theta) \). That is,
\[ p(\theta) = \int_0^\infty P(r, \theta) dr. \tag{3.45} \]

We first rewrite the numerator in the exponential function of Eq. 3.43 as
\[ r^2 + A_c^2 - 2A_c r \cdot \cos \theta = A_c^2 \sin^2 \theta + (r - A_c \cos \theta)^2, \tag{3.46} \]

to change (3.45) into
\[ p(\theta) = \frac{1}{2\pi\sigma_c^2} \cdot e^{-\frac{A_c^2 \sin^2 \theta}{2\sigma_c^2}} \int_0^\infty r \cdot e^{-\frac{(r-A_c \cos \theta)^2}{2\sigma_c^2}} dr. \tag{3.47} \]

Then, by letting \( z = (r - A_c \cos \theta)/(\sqrt{2}\sigma_c) \) and solving the integral in the above equation with the new integration variable \( z \), we obtain
\[
\begin{aligned}
\int_0^\infty r \, e^{-\frac{(r-A_c \cos \theta)^2}{2\sigma_c^2}} dr
&= \int_{-\frac{A_c \cos \theta}{\sqrt{2}\sigma_c}}^{\infty} \left(\sqrt{2}\sigma_c z + A_c \cos \theta\right) e^{-z^2} \sqrt{2}\sigma_c \, dz \\
&= 2\sigma_c^2 \int_{-\frac{A_c \cos \theta}{\sqrt{2}\sigma_c}}^{\infty} z e^{-z^2} dz + \sqrt{2}\sigma_c A_c \cos \theta \int_{-\frac{A_c \cos \theta}{\sqrt{2}\sigma_c}}^{\infty} e^{-z^2} dz \\
&= \sigma_c^2 e^{-\frac{A_c^2 \cos^2 \theta}{2\sigma_c^2}} + \sqrt{\frac{\pi}{2}} \, \sigma_c A_c \cos \theta \cdot \text{erfc} \left(-\frac{A_c \cos \theta}{\sqrt{2}\sigma_c}\right).
\end{aligned}
\]
We finally obtain the probability density function of the phase $\theta$ as a function of the noise power $\sigma_c^2$ defined in Eq. 3.44:

$$p(\theta) = \frac{1}{2\pi} e^{-\frac{A_c^2}{2\sigma_c^2}} + \frac{A_c \cos \theta}{2\sqrt{2\pi}\sigma_c} \cdot \text{erfc} \left( -\frac{A_c \cos \theta}{\sqrt{2}\sigma_c} \right) \cdot e^{-\frac{A_c^2 \sin^2 \theta}{2\sigma_c^2}}. \quad (3.48)$$

Since the value $\frac{A_c^2}{2}$ corresponds to the power of the CW input signal and $\sigma_c^2$ represents the noise power of $n_I(t)$, the term $\frac{A_c^2}{2\sigma_c^2} = \gamma_c$ is identified as the power signal-to-noise ratio at the demodulator input. Thus, the previous equation can be rewritten as

$$p(\theta) = \frac{1}{2\pi} e^{-\gamma_c} + \frac{1}{2} \sqrt{\frac{\gamma_c}{\pi}} \cdot \cos \theta \cdot \text{erfc} \left( -\sqrt{\gamma_c} \cdot \cos \theta \right) \cdot e^{-\gamma_c \sin^2 \theta}, \quad (3.49)$$

where $\text{erfc}(\cdot)$ is the complementary error function. For large power SNR (i.e. large $\gamma_c$) and small $\theta$, the equation above reduces to a normal (Gaussian) probability distribution $N(\mu_\theta, \sigma_\theta^2)$ with zero mean $\mu_\theta$ and a variance $\sigma_\theta^2 = \frac{1}{2\gamma_c}$. The statistical expression of the phase can thus be written as

$$p(\theta) \approx \tilde{p}(\theta) = \sqrt{\frac{\gamma_c}{\pi}} \cdot e^{-\gamma_c \cdot \theta^2} = N(0, \frac{1}{2\gamma_c}).$$ (3.50)

The variance of the phase defines the stochastic behavior of the zero-crossing instants (jitter) for the sampling operation, as described in more detail in the next section. Obviously, to obtain a maximum envelope at the demodulator output, the jitter on the sampling instants must be minimized and the input power SNR $\gamma_c$ must be maximized. Figure 3.19 illustrates the exact and the approximate probability distribution of the phase for two values of $\gamma_c$. We notice that for an SNR larger than 1, the distribution is well approximated by Eq. 3.50, whereas for low input SNR, the approximation deviates noticeably from the exact pdf.
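
As a quick numerical cross-check of Eqs. 3.49 and 3.50, the following Python sketch evaluates both densities and integrates the exact one over one period. It is only an illustration: the function names and the trapezoidal helper are assumptions introduced here, and $\text{erfc}$ is taken to be the standard complementary error function.

```python
import math

def phase_pdf_exact(theta, gamma_c):
    """Exact phase pdf of Eq. 3.49; gamma_c is the linear power SNR."""
    c, s = math.cos(theta), math.sin(theta)
    uniform_part = math.exp(-gamma_c) / (2.0 * math.pi)
    signal_part = (0.5 * math.sqrt(gamma_c / math.pi) * c
                   * math.erfc(-math.sqrt(gamma_c) * c)
                   * math.exp(-gamma_c * s * s))
    return uniform_part + signal_part

def phase_pdf_gauss(theta, gamma_c):
    """Large-SNR approximation of Eq. 3.50, i.e. N(0, 1/(2*gamma_c))."""
    return math.sqrt(gamma_c / math.pi) * math.exp(-gamma_c * theta * theta)

def area_exact(gamma_c, n=4000):
    """Trapezoidal integral of the exact pdf over (-pi, pi) (helper, not from the text)."""
    h = 2.0 * math.pi / n
    total = 0.5 * (phase_pdf_exact(-math.pi, gamma_c)
                   + phase_pdf_exact(math.pi, gamma_c))
    total += sum(phase_pdf_exact(-math.pi + i * h, gamma_c) for i in range(1, n))
    return total * h

def peak_relative_error(gamma_c):
    """Relative deviation of the Gaussian approximation at theta = 0."""
    exact = phase_pdf_exact(0.0, gamma_c)
    return abs(exact - phase_pdf_gauss(0.0, gamma_c)) / exact

for gamma_c in (0.2, 2.0):  # the two SNR values of Fig. 3.19
    print(gamma_c, area_exact(gamma_c), peak_relative_error(gamma_c))
```

With these definitions the exact pdf integrates to one for both SNR values, while the deviation of the Gaussian approximation at $\theta = 0$ stays well below 1% for $\gamma_c = 2$ and reaches roughly 20% for $\gamma_c = 0.2$.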

In the case of IR-UWB signals, the received signals $s_{c,Q}(t)$ and $s_{c,I}(t)$ are short amplitude-modulated pulses characterized by a time-varying amplitude $A_c(t)$, resulting in a process that is not strict-sense stationary. However, in order to estimate the signal amplitude at the demodulator output, we will consider a sufficiently slowly varying function $A_c(t) = A_c$, and therefore a stationary process.

Figure 3.19. Comparison of the exact and approximated probability density functions (pdf) of the phase $\theta$ for different power SNR $\gamma_c = 0.2$ and $\gamma_c = 2$. The exact analytical expression $p(\theta)$ is given by Eq. 3.49 and the large SNR approximation is given by $\tilde{p}(\theta)$ in Eq. 3.50.

Zero-Crossing Instants $t_k$

The $k^{th}$ zero-crossing instant $t_k$ of $s_I(t)$ satisfies the following equation

$$\omega_0 \cdot t_k + \theta = (2k - 1) \frac{\pi}{2}, \quad (3.51)$$

where $\theta$ is the random variable characterizing the phase of the signal in the presence of noise with zero mean $\mu_\theta$ and a variance $\sigma_\theta^2 = \frac{1}{2\gamma_c}$. Hence, the instant of the $k^{th}$ zero-crossing is

$$t_k = \frac{(2k - 1) \frac{\pi}{2} - \theta}{\omega_0}. \quad (3.52)$$

From Eqs. (3.52) and (3.50), we calculate the distribution of $t_k$. For large $\gamma_c$ and small $\theta$, $t_k$ is a Gaussian distributed variable with a mean $\mu_{t_k}$ giving the center value of the $k^{th}$ zero-crossing and a variance $\sigma_{t_k}^2$, given hereafter:

\[ \mu_{t_k} = \frac{(2k - 1)\pi - \mu_\theta}{2\omega_0} = \frac{(2k - 1)\pi}{2\omega_0} \quad (3.53) \]

\[ \sigma_{t_k}^2 = \frac{\sigma_\theta^2}{\omega_0^2} = \frac{1}{2\gamma_c\omega_0^2}. \quad (3.54) \]
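
To give a feel for the orders of magnitude in Eq. 3.54, the sketch below converts a power SNR into an RMS zero-crossing jitter. The 150 MHz value used for $f_0$ is the one quoted later in this section for Fig. 3.22; the 10 dB SNR and the function name are illustrative assumptions.

```python
import math

def zero_crossing_jitter_rms(gamma_c_db, f0_hz):
    """RMS jitter sigma_tk = 1 / (omega_0 * sqrt(2 * gamma_c)), from Eq. 3.54."""
    gamma_c = 10.0 ** (gamma_c_db / 10.0)   # dB -> linear power SNR
    omega_0 = 2.0 * math.pi * f0_hz         # angular offset frequency
    return 1.0 / (omega_0 * math.sqrt(2.0 * gamma_c))

sigma_t = zero_crossing_jitter_rms(gamma_c_db=10.0, f0_hz=150e6)
print(f"RMS jitter: {sigma_t * 1e12:.0f} ps")
```

For these example values the RMS jitter is about 237 ps, i.e. a few percent of the 6.7 ns period of a 150 MHz tone.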

The average amplitude at the demodulator output \( E\{A_d\} \) at sampling instant \( t_k \) is then

\[ E\{A_d\}_{|t_k} \approx \int_{-\infty}^{\infty} \tilde{p}(t_k) \cdot s_Q(t_k) dt_k \]

\[ = \int_{-\infty}^{\infty} N(\mu_{t_k}, \sigma_{t_k}^2) \cdot A_c \sin(\omega_0 t_k) dt_k. \quad (3.55) \]

Since the process is stationary, we can shift the whole function in the integral by \( \pi/2 \), so that \( \mu_{t_k} \) becomes zero and the \( \sin(\cdot) \) is transformed into a \( \cos(\cdot) \). We can then solve the above integral with the help of the following relation [74],

\[ \int_0^{\infty} e^{-a^2x^2} \cos(bx) dx = \frac{\sqrt{\pi}}{2a} e^{-b^2/(2a)^2}, \quad (3.56) \]

with \( a = \sqrt{\gamma_c}\omega_0 \), \( b = \omega_0 \) and \( x = t_k \), the average voltage at the demodulator output can be expressed as:

\[ E\{A_d\} = A_c \cdot e^{-1/(4\gamma_c)} \quad (3.57) \]

where \( A_c \) and \( \gamma_c \) represent the amplitude of the signal and the power signal-to-noise ratio at the input of the demodulator, respectively.
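
The closed form of Eq. 3.57 can be checked with a small Monte-Carlo experiment. The sketch below is not the Matlab simulation mentioned in the text: it only draws the phase $\theta$ from the Gaussian approximation of Eq. 3.50 and averages $A_c \cos\theta$, which is the value of the quadrature signal at a jittered zero-crossing of Eq. 3.52. Function names, sample count and seed are assumptions.

```python
import math
import random

def mean_envelope_theory(gamma_c, a_c=1.0):
    """Eq. 3.57: E{A_d} = A_c * exp(-1 / (4 * gamma_c))."""
    return a_c * math.exp(-1.0 / (4.0 * gamma_c))

def mean_envelope_mc(gamma_c, a_c=1.0, n=100_000, seed=1):
    """Average of A_c*cos(theta) with theta ~ N(0, 1/(2*gamma_c))."""
    rng = random.Random(seed)
    sigma_theta = math.sqrt(1.0 / (2.0 * gamma_c))
    total = 0.0
    for _ in range(n):
        theta = rng.gauss(0.0, sigma_theta)  # phase noise at the sampling instant
        total += a_c * math.cos(theta)       # sampled value of the Q signal
    return total / n

for gamma_c in (1.0, 4.0, 10.0):
    print(gamma_c, mean_envelope_mc(gamma_c), mean_envelope_theory(gamma_c))
```

Because the sketch uses the Gaussian phase model, it reproduces Eq. 3.57 at any SNR; it says nothing about the low-SNR region where the exact pdf of Eq. 3.49 departs from that model.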

The demodulator function can be seen as a down-conversion process by sampling. We use the quadrature property of the signals of the I and Q path to sample the signal at its maximum amplitude. In this configuration (rising-edge sensitive S&H devices), the signal is sampled at a frequency equivalent to the offset frequency \( f_o \).

Figure 3.20 shows a comparison between the theoretical calculation of the equivalent envelope attenuation and a Monte-Carlo simulation implemented with Matlab. The simulation has been carried out in the time domain with a sufficiently high sampling frequency.

The figure shows good agreement between simulation and theory for power SNR larger than -5 dB. As expected, for large SNR, the effect of noise on the sampling instants is reduced. The signal is sampled at its optimum point and the output envelope corresponds to the signal amplitude $A_c$. We can interpret the SNR degradation as a non-optimum sampling of the sinusoidal signal in quadrature with the sampling signal or, equivalently, as a non-ideal envelope recovery.

Figure 3.20. Comparison between simulations at different channel filter bandwidths (relative to the center frequency $f_0$, symbol markers) and approximation of the normalized output envelope (solid line) with respect to the input power SNR $\gamma_c$. A good agreement between simulations and theory is obtained for $\gamma_c$ larger than -5 dB, whereas a pessimistic estimation is obtained below this limit. The dotted line shows a numerical evaluation of the integral of Eq. 3.55 when replacing the pdf approximation $\tilde{p}(\theta)$ by the theoretical pdf $p(\theta)$ given in Eq. 3.49. The curve fits the experiment very well. Moreover, the figure shows that the original assumption on the bandwidth of the channel filter is not restrictive, in the sense that bandwidths larger than the center frequency (cross and square markers) stay in good agreement with channel filter bandwidth equal to half the pulse’s bandwidth (BW=0.5, point markers), especially at high SNR.

Remarks on I/Q Mismatch in Signal Demodulation

In this section about the demodulation principle, no mismatch in phase or amplitude in the I/Q branches has been considered, since it is known that this issue is less problematic in highly integrated direct-conversion systems using quadrature methods only for demodulation and not for image rejection [75]. However, the effects of I/Q mismatch have been evaluated by simulations in Section 3.10.1.

Output for Gaussian Envelope Input Signals

In the above section, we derived an analytical expression for the signal at the demodulator output under the assumption that the input signal has a constant amplitude $A_c$. The power SNR $\gamma_c$ was expressed as the ratio of the RMS signal power $A_c^2/2$ against the noise power $\sigma_c^2$. We now focus on the case of signals with a Gaussian time-varying envelope $A_c(t) = s(t) = a_c/b_c \cdot \exp [-\pi (t/b_c)^2]$. The instantaneous RMS power SNR of such signals is:

$$\gamma_c(t) = \frac{A_c^2(t)}{2\sigma_c^2} = \frac{a_c^2}{b_c^2} \left( e^{-\pi (t/b_c)^2} \right)^2 \cdot \frac{1}{2\sigma_c^2}, \quad (3.58)$$

where we identify $\frac{a_c^2}{2b_c^2\sigma_c^2} = \hat{\gamma}_c$ as the maximum RMS SNR at $t=0$ (maximum signal amplitude). We insert the instantaneous RMS power SNR in Eq. 3.57, which can be rewritten as

$$E\{A_d\} = A_c(t) \cdot e^{-1/(4\gamma_c(t))}. \quad (3.59)$$

In the latter equation, both the prefactor and the argument of the exponential function are time-varying. In order to simplify this expression, the argument of the exponential function is approximated by a Taylor series:

$$\frac{-1}{4\gamma_c(t)} = \frac{-b_c^2\sigma_c^2}{2a_c^2} \cdot \left| e^{-\pi (t/b_c)^2} \right|^{-2} \approx \frac{-b_c^2\sigma_c^2}{2a_c^2} \cdot \left( 1 + \epsilon \cdot 2\pi \cdot \left( \frac{t}{b_c} \right)^2 \right). \quad (3.60)$$

By choosing the correction parameter $\epsilon \approx 3$, the previous approximation yields less than $\pm 2$ dB of error for values of $\gamma_c(t)$ larger than -5 dB. Then, by replacing $A_c(t)$ in Eq. 3.59 by the time-varying input envelope $s_c(t)$ of Eq. 3.35, we obtain an estimation of the demodulated
Gaussian envelope at the output of the demodulator in the presence of noise and expressed as a function of the transceiver parameters $a_{tx}$, $b_c$ and the maximum RMS power SNR $\hat{\gamma}_c$ at the input:

$$E\{A_d(t)\} = s_c(t) \cdot e^{-1/(4\gamma_c(t))}$$

$$\approx \frac{a_{tx}}{b_c} \exp \left[-\pi \left(\frac{t}{b_c}\right)^2\right] \cdot \exp \left[-\frac{1}{4} \underbrace{\frac{2b_c^2\sigma_c^2}{a_c^2}}_{1/\hat{\gamma}_c} \left(1 + \epsilon \cdot 2\pi \cdot \left(\frac{t}{b_c}\right)^2\right) \right]. \quad (3.61)$$

**Transformation of SNR $\hat{\gamma}_c$ into $E_{tx}/N_0$**

The term $\hat{\gamma}_c = a_c^2/(2b_c^2\sigma_c^2)$ used in Eq. 3.61 defines the SNR after the channel filter expressed in terms of peak RMS signal power $a_c^2/(2b_c^2)$ and noise power $\sigma_c^2$. To enable comparison with other modulation schemes, we have to convert this expression into a value representing the classical SNR in terms of energy per bit $E_b$ over noise spectral density $N_0$, i.e. $E_b/N_0$, at the input of the receiver chain, before the channel filter. For $E_b = E_{tx}$, we obtain from Eq. 3.37 and by considering a frequency translated signal around $f_0$ ($E_c = a_c^2/(2\sqrt{2}b_c)$):

$$\frac{a_c^2}{2b_c^2} = \frac{\sqrt{2}\alpha_c E_{tx}}{b_c}. \quad (3.62)$$

Regarding the noise power $\sigma_c^2$ at the output of the channel filter, we can rewrite it as a function of the single-sided PSD noise $N_0$ at the channel filter input:

$$\sigma_c^2 = N_0 \cdot \int_0^\infty |H_c(f - f_0, \beta_c)|^2 df = \frac{N_0}{\sqrt{2}\beta_c}. \quad (3.63)$$

Consequently, we can rewrite the peak RMS power SNR $\hat{\gamma}_c$ at the demodulator input as a function of the bit energy SNR $E_{tx}/N_0$ at the channel filter input and design parameters $b_{tx}$ and $\beta_c$:

$$\hat{\gamma}_c = \frac{a_c^2}{2b_c^2\sigma_c^2} = 2 \frac{\alpha_c \beta_c}{b_c} \cdot \frac{E_{tx}}{N_0} = \frac{2b_{tx}\beta_c}{b_{tx}^2 + \beta_c^2} \cdot \frac{E_{tx}}{N_0}. \quad (3.64)$$

By using the adopted formalism, the signal at the demodulator output $s_d(t, b_d)$ can be written as

$$s_d(t, b_d) = \frac{a_d}{b_d} \cdot e^{-\pi(\frac{t}{b_d})^2}, \quad (3.65)$$
where

\[ a_d = a_{tx} \cdot \alpha_A \cdot \alpha_b = a_{tx} \cdot \exp\left[-\frac{b_{tx}^2 + \beta_c^2}{8 b_{tx} \beta_c} \frac{N_0}{E_{tx}}\right] \cdot \frac{1}{\sqrt{1 + \epsilon \frac{b_{tx}^2 + \beta_c^2}{4 b_{tx} \beta_c} \frac{N_0}{E_{tx}}}}, \quad (3.66) \]

\[ b_d = b_c \cdot \alpha_b = \sqrt{b_{tx}^2 + \beta_c^2} \cdot \frac{1}{\sqrt{1 + \epsilon \frac{b_{tx}^2 + \beta_c^2}{4 b_{tx} \beta_c} \frac{N_0}{E_{tx}}}}, \quad (3.67) \]

and are exclusively written in terms of the following design variables:

- \( a_{tx}, b_{tx} \), the input pulse amplitude and length parameters, respectively;
- \( \beta_c \), the channel filter bandwidth;
- \( E_{tx}/N_0 \), the bit SNR at the channel filter input;
- \( \epsilon \approx 3 \), a fitting parameter.

We notice that the pulse at the output of the demodulator suffers from two types of distortion. First, the amplitude \( A_d = a_d/b_d \) of the demodulated pulse decreases as the bit SNR \( E_{tx}/N_0 \) decreases. This attenuation comes from the exponential factor \( \alpha_A \) in Eq. 3.66, which tends to unity as the bit SNR increases towards infinity. The second type of distortion affects the equivalent pulse length \( b_d \). We observe in Eq. 3.67 that the input pulse length \( b_c \) is divided by the square root of a value larger than one (i.e. multiplied by \( \alpha_b < 1 \)), making the equivalent output pulse length shorter with decreasing values of bit SNR. From the equations above, we obtain the energy of the demodulated pulse:

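\[ E_d = \frac{a_d^2}{b_d \sqrt{2}} = \frac{a_{tx}^2 \alpha_A^2 \alpha_b}{\sqrt{2}\, b_c} = \frac{a_{tx}^2}{\sqrt{2}\sqrt{b_{tx}^2 + \beta_c^2}} \cdot \exp\left[-\frac{b_{tx}^2 + \beta_c^2}{4 b_{tx} \beta_c} \frac{N_0}{E_{tx}}\right] \cdot \frac{1}{\sqrt{1 + \epsilon \frac{b_{tx}^2 + \beta_c^2}{4 b_{tx} \beta_c} \frac{N_0}{E_{tx}}}}. \quad (3.68) \]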
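
The dependence of the demodulated pulse on the bit SNR can be explored numerically. The sketch below is a hedged illustration rather than the author's tool: it takes $\hat{\gamma}_c$ from the last form of Eq. 3.64 and reads the two distortion factors as $\alpha_A = e^{-1/(4\hat{\gamma}_c)}$ and $\alpha_b = (1 + \epsilon/(2\hat{\gamma}_c))^{-1/2}$, which is what inserting the Gaussian envelope into Eqs. 3.59 and 3.60 gives. That reading, the function names and the example values ($b_{tx} = 2\beta_c$, 10.5 dB) are assumptions.

```python
import math

def peak_snr(ebn0_db, b_tx, beta_c):
    """Peak power SNR at the demodulator input (last form of Eq. 3.64)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 2.0 * b_tx * beta_c / (b_tx ** 2 + beta_c ** 2) * ebn0

def distortion_factors(ebn0_db, b_tx, beta_c, eps=3.0):
    """Return (alpha_A, alpha_b): amplitude and pulse-length factors."""
    g = peak_snr(ebn0_db, b_tx, beta_c)
    alpha_a = math.exp(-1.0 / (4.0 * g))            # amplitude attenuation
    alpha_b = 1.0 / math.sqrt(1.0 + eps / (2.0 * g))  # pulse shortening
    return alpha_a, alpha_b

def demodulated_pulse(a_tx, b_tx, beta_c, ebn0_db, eps=3.0):
    """Return (a_d, b_d, E_d) with E_d = a_d^2 / (sqrt(2) * b_d)."""
    b_c = math.hypot(b_tx, beta_c)                  # b_c^2 = b_tx^2 + beta_c^2
    alpha_a, alpha_b = distortion_factors(ebn0_db, b_tx, beta_c, eps)
    b_d = b_c * alpha_b
    a_d = a_tx * alpha_a * alpha_b
    return a_d, b_d, a_d ** 2 / (math.sqrt(2.0) * b_d)

print(distortion_factors(10.5, b_tx=2.0, beta_c=1.0))
print(distortion_factors(40.0, b_tx=2.0, beta_c=1.0))
```

Under these assumptions both factors approach unity at high bit SNR, while at 10.5 dB the pulse is shortened by roughly 7% and its amplitude reduced by about 3%.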

### 3.7.5. Noise PSD at the S&H Output

In Section 3.7.4 (Fig. 3.18), we considered the effect of a noisy signal \( x_I(t) \) only on the sampling instants and assumed a noiseless signal \( x_Q(t) \). In this section, we analyze how the noise on the latter signal influences the receiver performance and, more specifically, how \( x_Q(t) \) is affected by the S&H operation triggered by the noisy signal $x_I(t)$. The S&H process is a linear operator; it consists of an ideal sampling operation followed by a hold function. The model of the S&H block is depicted in Fig. 3.21. We will consider the case of a high SNR input where the sampling instants can be considered as almost free from jitter. In this case, the equivalent transfer function of the sampling process is simply a comb of Dirac functions.

The equivalent noise at the input of the S&H is a nonwhite additive Gaussian noise (AGN), i.e. an AWGN filtered by the channel filter. With these assumptions, the sampling operation gives rise to many copies of the input noise spectrum, which are partly superimposed on one another, depending on the channel filter bandwidth. The frequency-shifted equivalent noise spectra are considered uncorrelated with each other; hence, power spectra rather than voltages are added to compute the equivalent output noise PSD.

At this level, the concept of an effective oversampling factor $n_s$ is introduced. This factor increases the effective sampling frequency and is determined by the effective number of S&H devices used in the demodulator. In Fig. 3.16, where a practical implementation of the demodulator is presented, four S&H devices are used. Two of them are sensitive to the rising edges of signals, and the other two trigger on falling edges. Due to the $90^\circ$ phase shift between each S&H, the effective oversampling factor $n_s$ is 4. If the configuration depicted in Fig. 3.18 is used, $n_s$ is then equal to 2. The equivalent noise output PSD resulting from the sampling process and the effective oversampling factor is depicted in Fig. 3.22 (dotted line). The noise PSD at the input of the demodulator is given for reference (dashed line).

The hold function needs a hold time $T_h$ equivalent to $\frac{1}{n_s f_0}$. The hold function is equivalent to a lowpass filter having a $\text{sinc}$ transfer function. We define the theoretical noise spectral density $N_d(f)$ at the demodulator output.

Figure 3.21. Model of the sample-and-hold device: the input noise $n_i(t)$ goes through an ideal sampling operation followed by a hold function, yielding the output noise $n_o(t)$.

Figure 3.22. Approximation of the noise PSD at the receive filter input for $b_{tx} = 2\beta_c$ and an effective oversampling factor $n_s = 4$. The sampling operation at $n_s \cdot f_0$ downconverts the bandpass noise around DC and gives rise to many copies of the bandpass noise at frequency offsets $k \cdot n_s \cdot f_0$, $k \in \mathbb{N}^*$. The error between the equivalent power defined by the exact transfer function of $N_d$ (continuous line) and the approximation used for the matched filter (dashed line) is less than 0.05 dB. The dashed line represents the noise psd at the output of the channel filter for an AWGN input.

The noise psd $N_d(f)$ is thus calculated as

$$N_d(f) = T_h^2 \text{sinc}^2(fT_h) \frac{N_0}{2} \sum_{k=-\infty}^{\infty} |H_c(f - n_s \cdot k \cdot f_0)|^2. \quad (3.69)$$

The noise psd at the output of a quadruple S&H demodulator ($n_s = 4$) for $b_{tx} = 2\beta_c$ and $f_0 = 150$ MHz is illustrated in Fig. 3.22 (continuous line). Owing to the linear property of the sampling process, the signal at the output of the demodulator is obtained by superimposing the signal $s_d(t)$ of Eqs. 3.65-3.67 with an AGN whose spectral density is given by Eq. 3.69. The latter expression can be advantageously approximated by the function that describes the noise psd at the output of the channel filter, as shown in Fig. 3.22. Since the hold function is a lowpass function that attenuates the higher spectral components of the noise psd, we make the following assumption
\[ N_d(f) \approx \frac{N_0}{2} |H_c(f)|^2. \] (3.70)

This approximation is valid when the channel filter bandwidth is smaller than the effective sampling frequency. For larger channel filter bandwidths, the noise at the demodulator output is affected by aliasing and shaped by the hold function.
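
The role of the hold function can be made concrete with a few numbers. The sketch below evaluates the $\text{sinc}^2(fT_h)$ factor of Eq. 3.69, normalized to one at DC, for $n_s = 4$ and $f_0 = 150$ MHz. It assumes the normalized definition $\text{sinc}(x) = \sin(\pi x)/(\pi x)$; the function name is hypothetical.

```python
import math

def hold_gain_sq(f_hz, n_s, f0_hz):
    """sinc^2(f*T_h) with T_h = 1/(n_s*f0), normalized to 1 at f = 0."""
    t_h = 1.0 / (n_s * f0_hz)        # hold time
    x = math.pi * f_hz * t_h
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x) ** 2

n_s, f0 = 4, 150e6
for f in (0.0, f0, 2 * f0, n_s * f0):
    print(f"{f / 1e6:6.0f} MHz -> {hold_gain_sq(f, n_s, f0):.4f}")
```

The nulls of the hold function fall on the multiples of $n_s f_0$, which are the centers of the spectral copies created by the sampling operation, whereas one offset frequency away from DC the attenuation is still below 1 dB.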

### 3.7.6. Bit Error Rate in an AWGN Channel

In order to evaluate the performance of the overall receiver chain, a matched filter implementation for the binary data decoder (“decoder” block in Fig. 3.17) is used in this section. It is well known [73] that the response \( H_r(f) \) of the optimum filter for a real signal corrupted by nonwhite additive Gaussian noise (AGN) has a transfer function that is the transfer function for the matched filter under AWGN conditions (i.e., matched to the signal set \( \pm s_d(t) \) at the input of the binary data decoder) divided by the spectral density of the noise. It follows that
\[
H_r(f) = \eta \frac{H_d(f) \cdot H_c(f)}{N_d(f)},
\] (3.71)

where \( \eta \) is an arbitrary factor used to normalize the output of the filter. With the approximation of Eq. 3.70, (3.71) becomes
\[
H_r(f) = \eta \frac{H_d(f) \cdot H_c(f)}{|H_c(f)|^2} = \eta \frac{H_d(f)}{H_c^*(f)} \tag{3.72}
\]

for real filter implementations. For our case using Gaussian filters, the parameter \( \beta_r \) defining the receive filter \( H_r(f) \) can simply be expressed as
\[
\beta_r = \sqrt{b_d^2 - \beta_c^2}. \tag{3.73}
\]

As seen in the previous sections, the parameters \( a_d \) and \( b_d \) defining the demodulated signal depend on the input bit SNR \( E_{tx}/N_0 \). Thus, the determination of the optimum receive filter \( H_r(f) \) requires the a priori knowledge of this SNR (cf. baseband). Hence, \( \beta_r \) should be adaptive and has to be rewritten as:
\[
\beta_r = \sqrt{b_d^2 - \beta_c^2} = \sqrt{b_c^2 \alpha_b^2 - \beta_c^2} = \sqrt{\alpha_b^2 b_{tx}^2 + \beta_c^2 (\alpha_b^2 - 1)} \tag{3.74}
\]
Figure 3.23. Theoretical BER according to Eq. 3.75 (continuous line) and simulated BER implemented with Matlab (point markers). Both agree very well except in the low SNR region where the model is less accurate. For comparison, the BER curve of the classical FSK non-coherent demodulator is shown (dashed curve).

We know from Eq. 3.67 that the parameter $\alpha_b$ goes to unity for increasing bit SNR, meaning that for very large SNR, $\beta_r \approx b_{tx}$. At first sight, this seems to contradict the fact that an optimum receiver should have an impulse response matched to the received signal (characterized by $b_c$ at the demodulator input), but we have to keep in mind that the noise is “colored” by the channel filter. Consequently, at the demodulator input, this noise intrinsically contains a multiplication by $1/H_c(f)$ that does not appear in the final expression of the receive filter $H_r(f)$ when the bit SNR is increasing towards infinity.

With these assumptions, the noise component at the output of the receiver filter has been whitened and now has the characteristic of AWGN [73]. The probability of making an error in the decision process may thus be approximated by

$$P_E \approx \frac{1}{2} \cdot \text{erfc} \left( \sqrt{\frac{E_d}{N_0}} \right), \quad (3.75)$$

where $E_d$ is given by Eq. 3.68 and is dependent on the SNR at the input of the channel filter $E_{tx}/N_0$.

Fig. 3.23 shows both theoretical and simulated BER for the AWGN channel with respect to $E_{tx}/N_0$. We notice that in order to achieve a reasonable BER of $10^{-3}$, the demodulator requires about 11 dB of signal-to-noise ratio at the channel filter input. With a near-optimum matched filter ($\beta_r = b_{tx}$), it can be observed that the proposed demodulator used with Gaussian IR-UWB pulses does not significantly degrade the performance with respect to the theoretical optimum noncoherent BFSK scheme. This has an interesting advantage in terms of implementation. The proposed modulation only requires a reduced-complexity direct-conversion approach owing to the DC-free spectrum and the high cut-off frequency allowed for the baseband strip. The receiver can thus be easily “channelized”. It also eases the synchronization owing to a demodulator output featuring a periodic baseband pulse.
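
For orientation, the reference curve of the classical noncoherent FSK demodulator can be reproduced with the textbook expression $P_E = \frac{1}{2} e^{-E_b/(2N_0)}$ for noncoherent orthogonal BFSK in AWGN. This expression is not taken from the present text and is used here only as an assumed reference; the function names are hypothetical.

```python
import math

def ber_noncoherent_bfsk(ebn0_db):
    """Textbook BER of noncoherent orthogonal BFSK in AWGN."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.exp(-ebn0 / 2.0)

def required_ebn0_db(target_ber):
    """Invert the expression above for a target BER."""
    return 10.0 * math.log10(-2.0 * math.log(2.0 * target_ber))

print(f"{required_ebn0_db(1e-3):.2f} dB for BER = 1e-3")
```

The reference scheme needs about 10.9 dB to reach a BER of $10^{-3}$, which is the order of magnitude quoted above for the proposed demodulator.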

3.8. Demodulation in UWB Channels

### 3.8.1. Integrate-and-Dump Receive Filter

The propagation of the IR-UWB signal in an indoor environment is affected by reflections off walls and obstacles. In this case, the condition of an AWGN channel no longer applies and the optimum receiver filter cannot simply be defined by Eq. 3.74.

A typical simulated realization of the channel impulse response $h_{CM3}(t)$ is generated using the channel model CM3 (indoor office LOS) and shown in Fig. 3.24. Let $s_t(t)$ denote the envelope of the transmitted pulse (gray curve). After multipath propagation, the received waveform is given by the convolution of $s_t(t) \cdot \cos(2\pi(f_c \pm f_0)t)$ with the physical channel realization $h_{CM3}(t)$. The eye diagram of the demodulated output signal $s_d(t)$ is depicted at an SNR of 25 dB in Fig. 3.24 by thin black lines.

The multipath corrupts the received signal in an unpredictable manner. An optimum detection method would require a specific operation, which searches for separable multipath components of the received signal. In Fig. 3.24, it would consist in looking for the signal peaks and tracking them separately\(^7\). A preferred solution for a simple and practical implementation is the use of a rectangular function for $h_r(t)$. Such a solution is also known as “integrate-and-dump” (I&D) or “energy collection” and is defined by its integration time $T_i$.

\(^7\)Demodulators of this type are called “RAKE” receivers.

### 3.8.2. Optimum Integration Time

The optimum integration time $T_{i,\text{opt}}$ has been investigated with the help of simulations on the different statistical indoor channel models. Figure 3.25 shows $T_{i,\text{opt}}$ for the different indoor channels CM1-CM4, CM7 and CM8. These simulations do not consider any intersymbol interference and only target the determination of the optimum $T_i$. The results show that the optimum window length $T_i$ (round markers with channel indication) lies between 15 and 40 ns, with the exception of the indoor channel CM8, which shows an optimum integration time around 160 ns. However, by choosing an integration time of $T_i = 40$ ns for CM8, an acceptable reduction of the sensitivity of 2.2 dB (from 16.3 to 18.5 dB) is observed. This value thus accommodates the different multipath indoor scenarios without the need for signal post-processing.
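
Why an optimum window exists at all can be illustrated with a deliberately simple toy model, which is not one of the CM1-CM8 channel models: assume the demodulated envelope decays as $e^{-t/\tau}$ and the noise variance at the I&D output grows linearly with $T_i$. The collected signal then saturates while the noise keeps growing, so the ratio of squared signal to noise variance has a maximum. All names and the value of $\tau$ below are hypothetical.

```python
import math

def id_output_snr(t_i, tau):
    """Relative output SNR of an integrate-and-dump over [0, t_i] for the
    toy envelope exp(-t/tau): (collected signal)^2 / (noise variance ~ t_i)."""
    collected = tau * (1.0 - math.exp(-t_i / tau))
    return collected ** 2 / t_i

def optimum_window(tau, steps=5000):
    """Grid search (step 0.001*tau, up to 5*tau) for the best window length."""
    candidates = [tau * 5.0 * (i + 1) / steps for i in range(steps)]
    return max(candidates, key=lambda t_i: id_output_snr(t_i, tau))

tau_ns = 15.0  # hypothetical decay constant, not a channel-model parameter
print(f"toy optimum window: {optimum_window(tau_ns):.1f} ns")
```

In this toy model the optimum lies near $1.26\,\tau$; integrating much longer mainly adds noise, and integrating much shorter discards signal energy. Real channel realizations are clustered and irregular, which is why the text relies on simulations.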

### 3.8.3. Other Multipath Effects

Since the incoming RF pulses are at a different center frequency due to BFSK, they do not experience exactly the same multipath conditions. The phenomenon can be observed in time domain in Fig. 3.24, where...
Figure 3.25. Optimum integration time $T_{i,\text{opt}}$ and maximum IPI-free pulse repetition rate $\text{PRR}_{\text{max}}$ for different indoor LOS (black curves) and NLOS (gray curves) scenarios. The round markers locate the optimum $T_{i,\text{opt}}$ for the six considered indoor multipath scenarios. The values $T_{i,\text{opt}}$ have been obtained by averaging the optimum integration times of 20 channel realizations. The longer integration time required by CM8 (indoor NLOS industrial) can be explained by Fig. 3.3, which shows that the RMS delay spread is 89 ns for the considered channel model. Choosing $T_i$ much smaller than this value only collects a fraction of the energy and consequently increases the BER.

We observe that the large bandwidth used at the transmission implies that the signal does not suffer from deep fades, as is the case for narrowband communications; thus the standard deviation of the sensitivity is quite small (1.5 dB and 2.1 dB simulated for CM1 and CM7, respectively). This small standard deviation can be understood by the ability of the UWB signal to cope with dense multipath indoor channels [58]. This is illustrated in the sensitivity histograms of Fig. 3.26 and Fig. 3.27. The examples above have been simulated with an integration time of 20 ns. In the worst cases, the required SNR to reach a BER of $10^{-3}$ is 20.6 dB and 22.8 dB for channel models CM1 and CM7, respectively.

Figure 3.26. Simulated BER for channel model CM1 (100 realizations). The worst case BER curve intercepts the $10^{-3}$ BER value at an SNR of 20.6 dB. The bottom figure provides a histogram of the sensitivity. The 1.5 dB standard deviation can be understood by the ability of the UWB signal to resolve the dense multipath (no deep fade in the detected signal).

Figure 3.27. Simulated BER for channel model CM7. The worst case BER curve intercepts the $10^{-3}$ BER value at an SNR of approximately 23 dB (channel realization r10). The bottom figure provides a histogram of the sensitivity.

### 3.8.4. Effect of Interpulse Interference (IPI)

In the previous section, the optimum integration time under different multipath channels has been determined. Based on this approach, one may think that the maximum achievable pulse rate is equivalent to the inverse of the integration time $T_i$. However, at the maximum bit rate, the residual energy that is not used for the integration spreads over the following pulses and results in IPI. Since the proposed receiver targets a low-complexity implementation and avoids the use of high-speed ADCs and digital post-processing, no filter or channel equalization can be applied to mitigate IPI. The only means to cope with this effect is the reduction of the bit rate to a value that is not affected by IPI. As an example, we investigate this effect in detail in Fig. 3.28 for channel CM3 and integration time $T_i = 20$ ns. The latter value fixes the maximum pulse repetition rate to 50 MHz. At this speed, we observe that 20% of the multipath realizations cannot be resolved due to IPI (rightmost bar in the histogram). For these cases, the IPI bounds the BER to a minimum value that does not intersect with the chosen reference of $10^{-3}$.

**Table 3.5. Unresolved realizations due to IPI**

<table>
<thead>
<tr>
<th>Pulse rate</th>
<th>Channel models</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>CM1</td>
</tr>
<tr>
<td>40 MHz</td>
<td>19%</td>
</tr>
<tr>
<td>20 MHz</td>
<td>1%</td>
</tr>
<tr>
<td>10 MHz</td>
<td>0%</td>
</tr>
<tr>
<td>5 MHz</td>
<td>0%</td>
</tr>
</tbody>
</table>

Exhaustive simulations in Table 3.5 show that the maximum pulse rate that ensures a complete resolution of the indoor multipath channel without sophisticated channel equalization is 20 MHz for CM3, 10 MHz for channels CM1, CM2, CM4 and CM7, and 5 MHz for channel CM8. These values, as well as the optimum integration times determined in Section 3.8.2, will serve as the basis for the specification of the transceiver. By choosing a pulse repetition period five times longer than the integration time, i.e.,

\[ T_b = 5T_i, \quad (3.76) \]

we can optimally cover the different channel scenarios in terms of sensitivity and maximize the pulse rate without any recourse to sophisticated channel equalization. With the aforementioned relationship between \( T_i \) and \( T_b \), the following sensitivity degradation \( \Delta_{Sens.} \) can be expected from Fig. 3.28:

- CM1: \( T_b = 100 \text{ ns}, \ T_i = 20 \text{ ns} \ (T_{i, opt} = 19 \text{ ns}), \rightarrow \Delta_{Sens.} \approx 0 \text{ dB}; \)
- CM2: \( T_b = 100 \text{ ns}, \ T_i = 20 \text{ ns} \ (T_{i, opt} = 34.6 \text{ ns}), \rightarrow \Delta_{Sens.} \approx 0 \text{ dB}; \)

We see that for most of the scenarios, the rule given by Eq. 3.76 leads to negligible losses in the sensitivity of the receiver. Moreover, such a technique provides a simple way to adapt to the channel scenario: changing the pulse repetition rate is sufficient to avoid the effect of IPI.

3.9. Link Budget

To evaluate the performance of the system, we consider the nominal system configuration, i.e. a pulse transmission rate \( R_p \) fixed at 10 Mp/s with an integration time \( T_i = 20 \) ns. As seen in the previous section, the integration time \( T_i \) is close to an optimum and the chosen pulse rate prevents the system from being corrupted by intersymbol interference (ISI) in the case of an indoor LOS link with multipath (Table 3.5).

The link margin (LM) is a figure of merit for a communication link and expresses the extent (in dB) by which the normal working SNR exceeds a threshold SNR at which the link is deemed unusable or unreliable. For our application, the nominal condition of operation corresponds to a communication distance of 10 m and a bit rate fixed at 10 Mb/s. The threshold is defined at a BER of \( 10^{-3} \) (uncoded, i.e. without error correction algorithm or any processing gain). The link margin \( LM \) in dB is thus defined as:

\[
LM = SNR_{|RX@10m} - SNR_{|10^{-3}BER}, \quad (3.77)
\]

where \( SNR_{|RX@10m} \) is the RF signal-to-noise ratio at 10 m distance and is determined by

\[
SNR_{|RX@10m} = 10 \log \left. \frac{E_b}{N_o} \right|_{RX@10m} = P_{tx} + G_P(d,f) - 10 \log R_p - N_o, \quad (3.78)
\]

where \( P_{tx} \) is the average emitted power (in dBW) obtained from Eq. 3.24, \( G_P(d,f) \) is the path gain (in dB) given by Eq. 3.9 and Table 3.1, \( R_p \) is the pulse rate and \( N_o \) represents the noise PSD at the receiver (in dBW/Hz), which is related to the noise figure NF of the receiver as follows:

\[ N_o = 10 \log kT + NF = -204 + NF. \quad (3.79) \]

### 3.9.1. Performance in Free Space with Optimal Filter

By assuming free-space propagation conditions in Channel A (center frequency of \( f_c = 3.45 \) GHz), a receiver noise figure of 6 dB and a tolerated BER of \( 10^{-3} \) for which an SNR of 10.5 dB is required (Fig. 3.23), a link margin of 7.2 dB can be reached at a Tx-Rx distance of \( d = 10 \) m. For \( LM = 0 \), this corresponds to a distance of 22.9 m. This range is estimated by

\[ D|_{BER=10^{-3}} = d_{\text{nom}} \cdot 10^{LM/(10 n)}, \quad (3.80) \]

where \( d_{\text{nom}} = 10 \) m is the nominal distance for link margin calculation and \( n \) is the propagation exponent of the considered channel (Table 3.1).
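
The free-space numbers above can be reproduced with a few lines of arithmetic. The sketch below adds the quantities in dB (the path gain is negative, so it is added), uses the rounded $-204$ dBW/Hz of Eq. 3.79 and takes $P_{tx} = -46.6$ dBW and $G_P = -63.7$ dB from Table 3.6. The function names are assumptions introduced for this illustration.

```python
import math

def noise_psd_dbw_hz(nf_db):
    """Noise PSD of Eq. 3.79: 10*log10(kT) + NF, with 10*log10(kT) ~ -204."""
    return -204.0 + nf_db

def snr_rx_db(p_tx_dbw, g_p_db, pulse_rate_hz, nf_db):
    """Received Eb/No in dB: emitted power plus path gain, referred to one
    pulse (minus 10*log10(R_p)) and to the noise PSD."""
    e_rx_dbj = p_tx_dbw + g_p_db - 10.0 * math.log10(pulse_rate_hz)
    return e_rx_dbj - noise_psd_dbw_hz(nf_db)

def max_range_m(lm_db, n, d_nom_m=10.0):
    """Distance at which the link margin drops to 0 dB."""
    return d_nom_m * 10.0 ** (lm_db / (10.0 * n))

snr = snr_rx_db(p_tx_dbw=-46.6, g_p_db=-63.7, pulse_rate_hz=10e6, nf_db=6.0)
lm = snr - 10.5                      # threshold SNR with the optimal filter
print(f"SNR = {snr:.1f} dB, LM = {lm:.1f} dB, range = {max_range_m(lm, 2.0):.1f} m")
```

With these inputs the sketch returns the 17.7 dB received SNR, the 7.2 dB link margin and the 22.9 m range quoted in the text.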

### 3.9.2. Performance with I&D Filter

It is interesting to note that the use of an I&D receive filter in free-space conditions leads to a loss of sensitivity of approximately 2 dB. The sensitivity calculated in Section 3.7.6 becomes 12.5 dB at a \( 10^{-3} \) BER, which corresponds to a link margin of 5.2 dB and an Rx-Tx distance reduced from 22.9 m to 18.2 m for \( LM = 0 \). An overview of the link budget in free space is given at the end of this section in Table 3.6. In multipath channels, we have seen in Section 3.8.3 that, in the worst cases, the required SNR to reach a BER of \( 10^{-3} \) is 20.6 dB and 22.8 dB for channel models CM1 and CM7, respectively. Assuming propagation coefficients as defined in Table 3.1, this corresponds to worst-case transmitter-to-receiver distances of approximately 12.4 m for CM1 and 2.4 m for CM7.

### 3.9.3. Summary

The same analysis is performed for the remaining indoor LOS channel CM3 and summarized in Table 3.6. At a pulse rate of 10 MHz, the average achieved range is above the targeted 10 m. All the scenarios except CM7 exhibit a worst-case communication range larger than 10 m.

Table 3.6. Estimated link budget in free space and LOS UWB channels for nominal parameters: $f_c = 3.45$ GHz (Channel A), $BR = 10$ Mpulses/s, $T_i = 20$ ns, receiver noise figure $NF = 6$ dB ($G_P$ is the path gain at $d = 10$ m).

<table>
<thead>
<tr>
<th>Conditions</th>
<th>Free space $n = 2$</th>
<th>CM1 $n = 1.79$</th>
<th>CM3 $n = 1.63$</th>
<th>CM7 $n = 1.2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>$P_{tx}$</td>
<td>-46.6 dBW</td>
<td>-59.1 dB</td>
<td>-49 dB</td>
<td>-66.0 dB</td>
</tr>
<tr>
<td>$G_P$</td>
<td>-63.7 dB</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>$10 \log_{10} R_p$</td>
<td>-70 dB</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>$E_{rx}$</td>
<td>-180.3 dBJ</td>
<td>-175.7 dBJ</td>
<td>-165.6 dBJ</td>
<td>-182.6 dBJ</td>
</tr>
<tr>
<td>$N_o$</td>
<td>-198 dBW/Hz</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>$SNR_{RX@10m}$</td>
<td>17.7 dB</td>
<td>22.3 dB</td>
<td>32.4 dB</td>
<td>15.4 dB</td>
</tr>
<tr>
<td>$SNR_{10^{-3}\text{BER}}$</td>
<td>12.5 dB</td>
<td>13.7/20.6 dB</td>
<td>14.5/19.3 dB</td>
<td>14.6/22.8 dB</td>
</tr>
<tr>
<td>LM (10m)</td>
<td>5.2 dB</td>
<td>8.6/1.7 dB</td>
<td>17.9/13.1 dB</td>
<td>0.8/-7.4 dB</td>
</tr>
<tr>
<td>Range</td>
<td>18.2 m</td>
<td>30.2/12.4 m</td>
<td>125/63 m</td>
<td>11.7 / 2.4 m</td>
</tr>
</tbody>
</table>
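The chain of the link budget can be retraced numerically. The Python sketch below is only an illustration under the assumption that the rows combine as \( E_{rx} = P_{tx} + G_P - 10 \log_{10} R_p \), followed by \( SNR = E_{rx} - N_o \) and \( LM = SNR - SNR_{10^{-3}\text{BER}} \); the function name is hypothetical.

```python
import math


def link_margin_db(p_tx_dbw, g_p_db, pulse_rate_hz, n0_dbw_hz, snr_req_db):
    """Return (E_rx in dBJ, SNR in dB, link margin in dB) at the nominal distance."""
    # Energy per received pulse: average received power divided by the pulse rate
    e_rx_dbj = p_tx_dbw + g_p_db - 10.0 * math.log10(pulse_rate_hz)
    # SNR expressed as pulse energy over noise power spectral density
    snr_db = e_rx_dbj - n0_dbw_hz
    return e_rx_dbj, snr_db, snr_db - snr_req_db


# Free-space column: G_P = -63.7 dB at 10 m, 10 Mpulses/s, N_o = -198 dBW/Hz
e_rx, snr, lm = link_margin_db(-46.6, -63.7, 10e6, -198.0, 12.5)
print(round(e_rx, 1), round(snr, 1), round(lm, 1))  # -180.3 17.7 5.2
```

The same call with the path gain and required SNR of another channel reproduces the corresponding column.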

In practice, this distance (or the link margin) will decrease due to other losses caused, for example, by imperfect synchronization, demodulator implementation losses, I/Q branch mismatch, frequency offset between transmitter and receiver, or non-free-space propagation channels. These effects will be investigated with the help of simulations in the following section.

## 3.10. Implementation Losses

In this section, we investigate with the help of numerical simulation the losses caused by potential implementation impairments. This will help to refine the specification of the transceiver.

### 3.10.1. I/Q Mismatch

One of the main sources of performance degradation in a quadrature receiver is the mismatch between the I and Q paths. Mismatches in amplitude and phase both reduce the signal amplitude at the demodulator output, which translates into a reduced SNR and, consequently, into a loss of sensitivity. I/Q mismatch can occur either in the baseband strip (mixers, channel filter, VGA and demodulator) or in the quadrature RF local oscillator. At the system level, we investigate the effect of amplitude and phase mismatch on the sensitivity degradation by means of numerical simulations. For that purpose, the signals in the I and Q branches, described by Eq. 3.38 as an example for the I-branch, are modified as follows:

\[ x_I(t) = \left(1 + \frac{\epsilon_A}{2}\right) \cdot r(t) \cdot \cos(\omega_0 t + \theta(t) + \frac{\Delta \theta}{2}), \]
\[ x_Q(t) = \left(1 - \frac{\epsilon_A}{2}\right) \cdot r(t) \cdot \sin(\omega_0 t + \theta(t) - \frac{\Delta \theta}{2}), \] (3.81)

where \( \epsilon_A \) is the amplitude imbalance and \( \Delta \theta \) is the phase imbalance between path I and Q. Note that the amplitude mismatch is applied on the noisy signal, since the mismatch mainly occurs in the mixer and the baseband stage. Simulation results are given in Fig. 3.29 for amplitude and phase mismatch. For a sensitivity degradation smaller than 0.5 dB, the phase imbalance must satisfy \( \Delta \theta < 20^\circ \) and the amplitude mismatch \( \epsilon_A \) should be kept smaller than 0.4 (40%).
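To make the mismatch model of Eq. 3.81 more concrete, the following Python sketch applies the amplitude and phase imbalance to a noiseless unit-envelope signal and reports how much the I/Q envelope deviates from its ideal constant value. It does not reproduce the BER simulations of Fig. 3.29; the function names and the choice of the envelope as an indicator are assumptions made for illustration only.

```python
import math


def iq_with_mismatch(r, phase, eps_a=0.0, dtheta=0.0):
    """Return (x_I, x_Q) following Eq. 3.81.

    r      : envelope of the received signal
    phase  : instantaneous phase w0*t + theta(t), in rad
    eps_a  : amplitude imbalance (e.g. 0.4 for 40%)
    dtheta : phase imbalance between the I and Q paths, in rad
    """
    x_i = (1.0 + eps_a / 2.0) * r * math.cos(phase + dtheta / 2.0)
    x_q = (1.0 - eps_a / 2.0) * r * math.sin(phase - dtheta / 2.0)
    return x_i, x_q


def envelope_extremes(eps_a, dtheta, steps=360):
    """Minimum and maximum of sqrt(x_I^2 + x_Q^2) over one period of the phase."""
    mags = []
    for k in range(steps):
        x_i, x_q = iq_with_mismatch(1.0, 2.0 * math.pi * k / steps, eps_a, dtheta)
        mags.append(math.hypot(x_i, x_q))
    return min(mags), max(mags)


# Without mismatch the envelope stays at 1; the imbalance makes it fluctuate
print(envelope_extremes(0.0, 0.0))
print(envelope_extremes(0.4, math.radians(20.0)))
```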

![Figure 3.29. Simulated receiver’s sensitivity degradation (at BER= 10\(^{-2}\)) against amplitude mismatch \( \epsilon_A \) (left-hand plot) and phase mismatch \( \Delta \theta \) (right-hand plot).](image)

### 3.10.2. BER Degradation due to Channel Filter

The channel filter is basically a lowpass filter, whose function is to reject the out-of-band down-converted signals coming from adjacent channels and/or from strong continuous-wave interferers. In this section, the influence of the corner frequency is simulated in order to evaluate its impact on the overall receiver sensitivity. The results for a 5th-order Chebyshev filter approximation are given in Fig. 3.30. We assume here that the filters of the I and Q paths experience the same cutoff-frequency deviation caused by process variations. A mismatch between the cutoff frequencies or the gains has already been considered in Section 3.10.1 in the form of amplitude and phase imbalance.

![SNR degradation vs. Channel Filter Bandwidth](image)

**Figure 3.30.** Simulated sensitivity degradation (at BER=$10^{-2}$) against the corner frequency of the 5th-order Chebyshev channel filter. A variation of ±10% of the corner frequency leaves the sensitivity degradation below 0.2 dB. This value defines a relaxed tolerance on the corner frequency and will enable the implementation of a simple calibration method (see Section 4.5.2).

The optimum corner frequency lies around 200 MHz and is related to the pulse bandwidth. For corner frequencies higher than this value, excess noise reaches the demodulator input, which degrades the BER performance. For corner frequencies below 200 MHz, the BER degrades as well, this time owing to an excessive attenuation of the incoming signal.

### 3.10.3. BER Degradation due to a Frequency Offset

The large bandwidth of IR-UWB RF signals somewhat relaxes the required accuracy in terms of instantaneous frequency for both the transmitted signal and the receiver’s LO. The influence of a frequency offset between the transmitter and the receiver has been investigated by means of simulations. The x-axis of Fig. 3.31 reports the absolute value of the frequency offset in MHz, and the y-axis represents the equivalent SNR increase required to reach a BER of $10^{-3}$. We observe that a frequency offset of $\pm 30$ MHz between the transmitter and the receiver degrades the sensitivity by approximately 0.5 dB. This offset value of $\pm 30$ MHz will be used as an upper bound to specify the carrier generation device of the transceiver.

![Graph showing the influence of frequency offset on sensitivity degradation at BER=10⁻³ dB](image)

**Figure 3.31.** Influence of a frequency offset between the transmitter and the receiver. An offset frequency of $\pm 30$ MHz increases the required SNR at the receiver by about 0.5 dB for the same BER ($10^{-3}$).

### 3.10.4. BER Degradation due to LO Phase Noise

The goal of this section is to evaluate the impact of the LO phase noise on the demodulation process. We will first describe qualitatively the effect of the phase noise on the frequency down-conversion and the demodulation and then, we will introduce a method to include phase noise in high-level system simulations and compute its effect on BER.

**Phase Noise Effect**

Phase noise affects the down-conversion and the demodulation in several ways. We first consider the case of a non-ideal LO whose spectrum is degraded by phase noise. The phase noise widens the LO spectrum, as illustrated in the left-hand part of Fig. 3.32. We assume that the ideal LO (dashed line) and the noisy LO (continuous line) have the same overall power. The mixing of the noisy LO with the original IR-UWB RF signal corresponds to a convolution in the frequency domain. The resulting baseband signal is distorted, as sketched by the continuous lines in the right-hand part of Fig. 3.32. The consequences in a direct-conversion topology are as follows. First, the spectrum widens towards high frequencies. This high-frequency part may fall out of the channel (gray-shaded area) and is filtered, thus reducing the received power. Second, the signal may extend beyond DC towards negative frequencies. This effect is similar to aliasing: this part of the pulse spectrum is folded onto positive frequencies. Depending on the demodulation method used, this effect may reduce the overall pulse power due to the destructive phase interference occurring between the folded spectrum and the pulse. The third effect appears in-band and takes the form of a pulse spectrum distortion caused by the convolution.

![Figure 3.32. LO phase noise effect at frequency down-conversion.](image)

The second main effect of phase noise appears in the I/Q demodulation process. This effect comes from the (correlated or uncorrelated) noise that may appear between the I and Q paths of the oscillator, and is similar to the effect of the noise coming from the signal path (LNA, mixers).

The phase noise is characterized by the spot noise power $\mathcal{L}(\Delta f)$ in a 1 Hz bandwidth at an offset $\Delta f$ from the carrier frequency $f_c$ and normalized to the carrier power:

\[ \mathcal{L}(\Delta f) = 10 \cdot \log_{10} \left( \frac{P_{\text{sideband, 1 Hz}}(\Delta f)}{P_{\text{carrier}}} \right) \] (3.82)

**Phase Noise Modelling for High Level Simulation**

The phase noise profile \( \mathcal{L}(\Delta f) \) depends mainly on the way the LO is generated (free-running or PLL-driven). Nevertheless, the large frequency offsets that take place in IR-UWB frequency down-conversion will allow some simplifications in the modelling of phase noise and this independently of the LO generation method.

We will assume that the perturbations responsible for the phase noise of the local oscillator are mainly caused by white-noise and flicker-noise stochastic processes [76]. The flicker noise effect in oscillators occurs at close-in offset frequencies, typically below the flicker noise corner of the technology [77], and depends mainly on the design symmetry and the size of the transistors used in the oscillator. Furthermore, in a PLL, the flicker noise of the VCO can be removed. On the other hand, at large offset frequencies, i.e. above the PLL bandwidth \( f_{\text{pll}} \), the noise of the VCO dominates and the oscillator can be considered as free-running. Demir et al. [78] showed that, for a free-running oscillator perturbed only by white noise sources, the spot noise power is

\[ \mathcal{L}(\Delta f) = \frac{c f_c^2}{f_{\Delta}^2 + \Delta f^2}, \] (3.83)

where \( f_c \) is the carrier frequency, \( c \) is a parameter characterizing the noise magnitude and \( f_{\Delta} = \pi c f_c^2 \) is the corner frequency, also known as the oscillator “linewidth”. \( f_{\Delta} \) does not exceed a few Hz for practical oscillators. At frequency offsets of several MHz, \( \Delta f \gg f_{\Delta} \) and Eq. 3.83 can be approximated by

\[ \mathcal{L}(\Delta f) \approx \frac{c f_c^2}{\Delta f^2}. \] (3.84)

Our approach will use a PLL to allow precise frequency generation. Since the frequency offset \( f_0 = 150 \) MHz of the BFSK scheme is larger than the PLL bandwidth, the IR-UWB signal experiences a phase noise having a -20 dB/dec slope.
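The -20 dB/dec slope makes it easy to translate a phase noise level from one offset to another, for instance to compare it with oscillator figures quoted at 1 MHz. The Python sketch below illustrates this under the assumption that both offsets lie in the region where Eq. 3.84 holds (well above the linewidth and above the PLL bandwidth); the function name and the reference level of -105 dBc/Hz at 150 MHz are illustrative choices.

```python
import math


def phase_noise_at(offset_hz, l_ref_dbc_hz, ref_offset_hz):
    """Extrapolate L(df) along a -20 dB/decade slope (1/df^2 law of Eq. 3.84)."""
    return l_ref_dbc_hz - 20.0 * math.log10(offset_hz / ref_offset_hz)


# Level at a 1 MHz offset equivalent to -105 dBc/Hz at a 150 MHz offset
print(round(phase_noise_at(1e6, -105.0, 150e6), 1))  # -61.5
```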

The temporal modelling of phase noise has already been extensively investigated in the work of Kasdin [79], in which the noise is generated as a random sequence with a PSD characterized by a particular power law $1/f^\alpha$. Another work [80] makes use of the theory of non-linear dynamics to generate noise sources with the desired characteristics in the time domain. These approaches are, however, difficult to implement and require sophisticated mathematics and time-consuming processing. Event-driven methods [81] are far more efficient in terms of computer resources but are not simple to integrate into a complete system simulation. We will use here a method based on the works of Le Brun [82] and Bur-Guy [83], which consists in the discrete reconstruction of the SSB noise profile in the frequency domain by the sum of $N$ uncorrelated sinusoids. The angular frequency $\omega_i$ and the amplitude $\eta_i$ of these sinusoids are chosen to match the desired SSB spectrum, while the phase $\phi_i$ is uniformly distributed in $[-\pi, \pi]$. The LO can be described as a voltage signal

$$v(t) = V_c \cdot \cos (\omega_c t + \phi(t))$$

$$= V_c \cdot \cos \left( \omega_c t + \sum_{i=0}^{N} \eta_i \sin (\omega_i t + \phi_i) \right),$$

where $V_c$ and $\omega_c$ are the amplitude and the angular frequency of the carrier, respectively. An example of a synthesized noisy LO is given in Fig. 3.33. The targeted PSD profile (black thick line) is similar to the one obtained with a PLL with a bandwidth $f_{\text{PLL}}$. The light gray points represent the PSD obtained over a length of 1 $\mu$s ($\Delta f_{\text{min}} = 1$ MHz). The thin dark gray curve illustrates the averaged $L(\Delta f)$ and the convergence to the targeted phase noise. The gray shaded area locates the 200 MHz bandwidth IR-UWB pulse centered around $f_0$.
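A minimal Python sketch of this sum-of-sinusoids synthesis is given below. It is not the simulation code used in this work. It assumes small-angle phase modulation, for which a sinusoidal phase term of peak deviation $\eta$ creates a single sideband at $(\eta/2)^2$ relative to the carrier, so that each amplitude is set to $\eta_i = 2\sqrt{\mathcal{L}(f_i) \cdot \Delta f_{\text{bin}}}$ with $\mathcal{L}$ taken as a linear power ratio per Hz. All function and parameter names are hypothetical.

```python
import math
import random


def synthesize_phase_noise(l_dbc_hz, offsets_hz, bin_hz, seed=0):
    """Build the (eta_i, omega_i, phi_i) triplets of the phase term phi(t).

    l_dbc_hz   : function returning the target SSB level in dBc/Hz at an offset
    offsets_hz : offset frequencies f_i of the sinusoids, in Hz
    bin_hz     : bandwidth represented by each sinusoid, in Hz
    """
    rng = random.Random(seed)
    tones = []
    for f in offsets_hz:
        # Sideband-to-carrier power ratio collected in one frequency bin
        ssb_ratio = 10.0 ** (l_dbc_hz(f) / 10.0) * bin_hz
        # Small-angle assumption: sideband ratio = (eta / 2)^2
        eta = 2.0 * math.sqrt(ssb_ratio)
        phi = rng.uniform(-math.pi, math.pi)
        tones.append((eta, 2.0 * math.pi * f, phi))
    return tones


def noisy_lo(t, v_c, omega_c, tones):
    """LO voltage v(t) = V_c * cos(omega_c * t + phi(t))."""
    phi_t = sum(eta * math.sin(w * t + p) for eta, w, p in tones)
    return v_c * math.cos(omega_c * t + phi_t)


# Illustrative profile: -20 dB/dec slope through -105 dBc/Hz at 150 MHz
profile = lambda f: -105.0 - 20.0 * math.log10(f / 150e6)
tones = synthesize_phase_noise(profile, [k * 1e6 for k in range(1, 401)], 1e6)
samples = [noisy_lo(n * 50e-12, 1.0, 2.0 * math.pi * 3.45e9, tones) for n in range(1000)]
```

The 1 MHz spacing mirrors the frequency resolution of a 1 $\mu$s observation window; a finer grid requires a proportionally longer simulated signal.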

**Simulation Results**

The phase noise requirement is given by fixing the maximum allowed SNR degradation for a given BER of $10^{-3}$. In Fig. 3.34, the BER curves of the optimal non-coherent BFSK demodulator are given for different $L(\Delta f)$ varying from $-\infty$ dBc/Hz (continuous line) to -90 dBc/Hz (dash-dotted line). A negligible degradation can be observed for $L(\Delta f) < -110$ dBc/Hz.

Similar simulations have been carried out for an I&D detection method with $T_i = [10, 20, 40]$ ns, corresponding to transmission rates of 20, 10 and 5 Mp/s, respectively. The results are depicted in Fig. 3.35. We observe that the sensitivity to phase noise is reduced as the integration time increases. The dash-dotted curve ($T_i = 40$ ns) of Fig. 3.35 begins to increase only around -100 dBc/Hz, while for optimum non-coherent detection (continuous line) the degradation occurs around -105 dBc/Hz.

Figure 3.33. Phase noise synthesis for high-level simulations by discrete reconstruction through the sum of uncorrelated sinusoids.

The LO phase noise requirement is thus fixed to a value smaller than -105 dBc/Hz at an offset frequency $f_m = 150$ MHz. This quite relaxed value shows the relative insensitivity of a carrier-based IR-UWB direct-conversion receiver to phase noise.

A summary of the implementation losses is provided in Table 3.7.

Table 3.7. Summary of the transceiver implementation losses

<table>
<thead>
<tr>
<th></th>
<th>Impairment / specification</th>
<th>Implementation loss</th>
</tr>
</thead>
<tbody>
<tr>
<td>TX</td>
<td>Nonflat spectrum</td>
<td>$L_{nfs} = 2.34$ dB</td>
</tr>
<tr>
<td></td>
<td>Non-ideal dithered modulation</td>
<td>$L_{mod} \approx 3.3$ dB</td>
</tr>
<tr>
<td>RX</td>
<td>Front-end noise figure: NF &lt; 6 dB</td>
<td>$L_{NF} &lt; 6$ dB</td>
</tr>
<tr>
<td></td>
<td>I/Q mismatch: $\Delta \theta &lt; 10^\circ$ and $\epsilon_A &lt; 0.2$</td>
<td>$L_{I/Q} &lt; 0.4$ dB</td>
</tr>
<tr>
<td></td>
<td>Channel filter: $f_c \pm 10\%$</td>
<td>$L_{CHF} &lt; 0.2$ dB</td>
</tr>
<tr>
<td></td>
<td>LO phase noise: $\mathcal{L}(\Delta f) &lt; -105$ dBc/Hz</td>
<td>$L_{LOPN} &lt; 0.1$ dB</td>
</tr>
</tbody>
</table>
## 3.11. Interferences and Linearity

### 3.11.1. BER Degradation in the Presence of Interferers

The robustness of a receiver is characterized by its ability to decode the received signal even in the presence of interferences other than AWGN or multipath, as considered earlier in Sections 3.7.6 and 3.8, respectively.

Additional interferences are typically caused by other UWB transmitters (pulsed interferences) or originate from narrow-band radio devices (continuous-wave single-tone interferences). The latter are the most critical sources of interference owing to the emission powers involved, which can be as high as one watt (e.g. GSM, UMTS, DECT). An example of the electric field detected in a hospital environment is shown in Fig. 3.36. This section focuses on the estimation of interferences that may be harmful in the frequency band of interest, from 3 to 5 GHz. These impairments can be caused simply by in-band jamming signals or by the combined effects of strong out-of-band signals and nonlinearities of the RF receiver front-end (see Section 3.11.3).

Figure 3.34. BER degradation caused by LO phase noise with optimum detection. The spot noise level in dBc/Hz has been calculated around an offset frequency $\Delta f = f_0 = 150$ MHz. No degradation is observed for $\mathcal{L}(\Delta f)$ smaller than -110 dBc/Hz.

Figure 3.35. Sensitivity degradation vs. LO phase noise \((f_m = 150 \text{ MHz})\) for different detection methods (optimal non-coherent FSK and I&D with different integration times).

Figure 3.36. Overview of detected electric fields in the university hospital in Zurich from 800 MHz to 10 GHz measured with a log-periodic antenna (courtesy of Oliver Lauer, IfH, ETHZ, Zurich).

**CW Interfering Signal**

The first step is to evaluate the effect of an in-band interferer on the BER performance. A CW signal is simply added to the IR-UWB signal and its amplitude is increased until the receiver’s sensitivity is significantly degraded (typically by 3 dB). In order to obtain simulation results that are independent of the PRR, the signal-to-interference ratio (SIR) is expressed in terms of peak voltage amplitudes. Figure 3.37 shows that the amplitude of an in-band CW interferer must stay below one seventh (-17 dB) of the peak IR-UWB voltage (round markers). The simulation is based on an integration window $T_i$ of 20 ns. To obtain a worst-case scenario, the frequency of the CW interference has been chosen to maximize the degradation and corresponds to the center frequency of an RF IR-UWB pulse ($f_{\text{CW,interf}} = f_c \pm f_0$).

![Figure 3.37](image.png)

*Figure 3.37.* Simulated degradation of the required SNR for $10^{-2}$ BER (y-axis) caused by in-band CW and UWB interferences. The SIR is given here in terms of the peak voltage ratio $V_{\text{interf}}/V_{\text{UWB}}$ of the received signals and given for $T_i = 20$ ns.

**IR-UWB Interfering Signal**

The same procedure is applied to determine the sensitivity degradation caused by an in-band IR-UWB signal (single user) having the same pulse period $R_p^{-1}$ of 100 ns but not synchronized with the received pulse stream. In this case, the receiver accommodates an interfering peak pulse amplitude as high as 4 dB below the signal of interest. These results are illustrated by the square markers of Fig. 3.37.

In summary, the simulations show that the SIR, in terms of voltage level with the nominal configuration ($R_p^{-1} = 100$ ns and $T_i = 20$ ns), must be larger than 17 dB and 4 dB for CW and unsynchronized IR-UWB interferences, respectively. These two values will help in the specification of the front-end linearity.
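For reference, the two thresholds can be converted between decibels and peak-voltage ratios with the usual $20 \log_{10}$ rule for voltage quantities (a worked conversion of the figures quoted above, not an additional simulation result):

\[ 20 \log_{10}(7) \approx 16.9\ \text{dB} \approx 17\ \text{dB}, \qquad 10^{-4/20} \approx 0.63. \]

In other words, a CW interferer must remain below about one seventh of the peak IR-UWB voltage, whereas an unsynchronized IR-UWB interferer can reach roughly 63% of it.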

### 3.11.2. Potential Interfering Signals

A worst-case scenario is presented to evaluate the influence of strong interfering signals. The potentially strongest sources of interference are the WLAN/WPAN radio services located on both sides of the IR-UWB band and mobile phone services such as GSM and UMTS. These radio services are the most likely to operate in the vicinity of the proposed UWB transceiver and, consequently, constitute a potentially strong threat.

**Table 3.8. Principal potential interfering radio systems**

<table>
<thead>
<tr>
<th>Standard</th>
<th>Freq. range [GHz]</th>
<th>Band no.</th>
<th>Max. power [dBm]</th>
<th>PSF att. [dB]</th>
</tr>
</thead>
<tbody>
<tr>
<td>GSM900</td>
<td>0.89 - 0.96</td>
<td>1</td>
<td>33</td>
<td>50</td>
</tr>
<tr>
<td>GSM1800</td>
<td>1.71-1.785</td>
<td>2</td>
<td>30</td>
<td>40</td>
</tr>
<tr>
<td>DECT</td>
<td>1.88-1.897</td>
<td>3</td>
<td>24</td>
<td>40</td>
</tr>
<tr>
<td>UMTS</td>
<td>1.92-2.17</td>
<td>4</td>
<td>24</td>
<td>38</td>
</tr>
<tr>
<td>Bluetooth</td>
<td>2.4 - 2.485</td>
<td>5</td>
<td>0, 4, 20</td>
<td>21</td>
</tr>
<tr>
<td>802.11b</td>
<td>2.4 - 2.485</td>
<td>5</td>
<td>0 (EC), 30 (US)</td>
<td>21</td>
</tr>
<tr>
<td>802.11g/n</td>
<td>2.414 - 2.472</td>
<td>5</td>
<td>20</td>
<td>21</td>
</tr>
<tr>
<td>WiMAX</td>
<td>3.41 - 3.6</td>
<td></td>
<td>33</td>
<td>0</td>
</tr>
<tr>
<td>802.11a</td>
<td>5.15 - 5.25</td>
<td>6</td>
<td>16</td>
<td>32</td>
</tr>
<tr>
<td></td>
<td>5.25 - 5.35</td>
<td>6</td>
<td>23</td>
<td>36</td>
</tr>
<tr>
<td></td>
<td>5.725 - 5.825</td>
<td>7</td>
<td>29</td>
<td>30</td>
</tr>
</tbody>
</table>

### 3.11.3. Front-End Linearity Requirement

A study of the front-end linearity is required for a complete specification of the RF front-end. The effect of nonlinearities in high-level simulations can be evaluated by introducing a memoryless polynomial block having the following characteristic:

\[ y(t) = a_0 + a_1 x(t) + a_2 x(t)^2 + a_3 x(t)^3 + \ldots + a_n x(t)^n, \quad (3.87) \]

where \( x(t) \) and \( y(t) \) are the input and the output of the polynomial block, respectively, and \( a_i \) are the constant coefficients of the polynomial expression. The frequency \( f_{IM[l,k]} \) of an intermodulation product (IMP) generated by a nonlinearity of order \( n \) is given by

\[ f_{IM[l,k]} = \pm k \cdot f_1 \pm l \cdot f_2, \quad (3.88) \]

where \( k + l = m \) and \( m = 2, 3, \ldots n \). Signs “±” are chosen such that
the resulting frequencies are positive. Both negative signs are consequently never used in the determination of IMP.
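The enumeration implied by Eq. 3.88 can be sketched in a few lines of Python. This is an illustrative reimplementation for two tones only, not the algorithm used in this work; the function names and the example tone pair (one carrier inside the GSM1800 band and one inside the lower 802.11a band, in MHz) are assumptions.

```python
def imp_frequencies(f1, f2, max_order=3):
    """Enumerate the products +/-k*f1 +/- l*f2 with k + l = m, m = 2..max_order.

    Returns a sorted list of tuples (frequency, order m, signed k, signed l).
    Harmonics (k = 0 or l = 0) are included. Only positive frequencies are
    kept, so the combination with two negative signs never appears.
    """
    products = set()
    for m in range(2, max_order + 1):
        for k in range(m + 1):
            l = m - k
            for sk in (1, -1):
                for sl in (1, -1):
                    f = sk * k * f1 + sl * l * f2
                    if f > 0:
                        products.add((f, m, sk * k, sl * l))
    return sorted(products)


def in_band(products, f_low, f_high):
    """Keep only the products falling inside [f_low, f_high]."""
    return [p for p in products if f_low <= p[0] <= f_high]


# Products of a 1750 MHz and a 5200 MHz tone that fall between 3 and 5 GHz
for f, m, k, l in in_band(imp_frequencies(1750, 5200), 3000, 5000):
    print(f"{f} MHz: order {m}, k = {k}, l = {l}")
```

For this tone pair, the difference product $f_2 - f_1$ lands at 3450 MHz and the second harmonic of $f_1$ at 3500 MHz, both inside the band of interest.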

Investigations on the effect of nonlinearity usually focus on a deep analysis of the second- and third-order IMP, whose main causes are mismatch (offset) and limitation of the dynamic range (saturation), respectively [84]. In a multi-band receiver, the frequencies \( f_{IM[l,k]} \) of IMP may actually fall in the pass-band of the system; their level must then be limited to a value which ensures a reliable communication.

The number of potential interferers associated with the IMP caused by second- and third-order nonlinearity \((n = 2, 3)\) renders an exhaustive analysis quite intricate. By selecting the seven out-of-band interferers listed in Table 3.8 (see column entitled “Band no.”), the calculation leads to 288 potential IMP. A MATLAB algorithm has been written to investigate every combination of IMP and their amplitudes. The goal of this simulation is to determine the front-end linearity required to enable a reliable communication in environments occupied by narrow-band services. The assumptions that have been made are listed hereafter:

- the interferers always emit at their maximum power and a 100% activity factor has been considered;

- the minimum distance between the interfering source and the IR-UWB receiver is 10 cm;

- the IR-UWB receiver antenna is a wide-band antenna with an equivalent Q-factor of 2.
### 3.11.4. Second- and Third-Order IMP

**Out-of-band CW Interferences**

The maximum IMP level is determined with the help of Fig. 3.37 and Eq. 3.18. We assume free-space propagation conditions and a received pulse energy of -180 dBJ at 10 m (Table 3.6). Under these conditions, the maximum IMP level is -83 dBm. The simulations show that, without any pre-select filter, the linearity requirements are $\text{IIP2} = 83 \text{ dBm}$ and $\text{IIP3} = 43 \text{ dBm}$. The use of a pre-select filter (PSF) in front of the LNA greatly reduces the linearity requirements. An example of typical attenuations of such a filter (see [85]) is given in the “PSF att.” column of Table 3.8. With these out-of-band attenuations, the linearity requirements can be relaxed to $\text{IIP2} = 23 \text{ dBm}$ and $\text{IIP3} = -22 \text{ dBm}$. These values are easily obtainable with low-power RF front-ends. This clearly demonstrates the need for a pre-select filter in low-power IR-UWB applications. An exhaustive investigation of the IMP is presented in Fig. 3.38 and Fig. 3.39 for second- and third-order IMP, respectively. The thick black line shows the IMP level, whereas the gray-shaded numbers represent the coefficients $k, l$ (Eq. 3.88) of the corresponding out-of-band interferers (rightmost column).
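The link between interferer levels, the maximum tolerated IMP level and the intercept points can be illustrated with the textbook relations for a weakly nonlinear front-end, $P_{IM2} = P_1 + P_2 - \text{IIP2}$ and $P_{IM3} = 2P_1 + P_2 - 2\,\text{IIP3}$ (all input-referred, in dBm). The Python sketch below uses these standard relations only; it is not the simulation described above, and the interferer levels in the example are hypothetical values chosen for illustration.

```python
def required_iip2_dbm(p1_dbm, p2_dbm, p_imp_max_dbm):
    """IIP2 needed to keep the product f1 +/- f2 at or below p_imp_max_dbm."""
    return p1_dbm + p2_dbm - p_imp_max_dbm


def required_iip3_dbm(p1_dbm, p2_dbm, p_imp_max_dbm):
    """IIP3 needed to keep the product 2*f1 +/- f2 at or below p_imp_max_dbm."""
    return (2.0 * p1_dbm + p2_dbm - p_imp_max_dbm) / 2.0


# Hypothetical interferer levels at the LNA input, after the pre-select filter
print(required_iip2_dbm(-30.0, -30.0, -83.0))  # 23.0
print(required_iip3_dbm(-42.0, -43.0, -83.0))  # -22.0
```

Every decibel of out-of-band attenuation applied to both tones lowers the required IIP2 by 2 dB and the required IIP3 by 1.5 dB, which is why the pre-select filter relaxes the specification so strongly.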

## 3.12. Summary and Conclusions

In this chapter, the entire specification for an impulse-radio UWB radio transceiver has been developed. Robustness, low complexity and low power are the key characteristics that have been kept in mind for the design of this transceiver. At the transmitter side, the first task was the investigation of a signalling scheme that fits the newly adopted UWB regulation masks; a signal metric to characterize pulsed UWB signals was first developed in Section 3.4. Then, a modulation scheme that allows a very simple but robust decoding was thoroughly investigated in Section 3.5. The modulation is based on BFSK used in conjunction with a carrier-less AM to generate a composite IR-UWB signal that meets the UWB regulation mask. This modulation has also been chosen for its robustness against multipath propagation and interferers (selectivity).

The overall receiver is based on a direct-conversion approach (Section 3.6) using signals in quadrature after the down-conversion. The composite RF signal is downconverted to baseband in a $\pm 200 \text{ MHz}$ bandwidth channel around DC. The demodulation of this baseband signal employs the quadricorrelation principle, which introduces practically no losses and provides, in AWGN channels, the BER performance of an ideal non-coherent demodulation scheme. The signal detection employs an integrate-and-dump method. The benefit of this detection scheme resides in its simplicity and its robustness. Furthermore, we will demonstrate in Section 7.3 that an important advantage appears at the signal synchronization. The performance of this demodulation scenario has also been investigated in dense multipath channels. The integration time for different indoor channels shows an optimum between 20 and 40 ns (see Section 3.8). The effect of ISI has also been studied. The most reliable pulse rate for an ISI-free communication is 5 MHz, whereas 10 to 20 MHz can be used with a minimum degradation except for channel model CM8 (indoor industrial, see Table 3.5).

From an implementation point of view, several aspects have been considered. First, mismatch between the I and Q paths has very little influence, mainly owing to the nature of the signals (short pulses) and also because the chosen topology does not require any image rejection (both sides around the LO are used). Therefore, no calibration is required at the receiver (Section 3.10.1). In Section 3.10.4, the impact of LO phase noise is investigated with the help of simulations. It can be shown that a relatively low-quality oscillator can be used without significantly affecting the overall BER. A phase noise of up to -110 dBc/Hz at 150 MHz offset can be used, which will enable wideband ring oscillators for the LO generation. Furthermore, the frequency offset that can be tolerated between the receiver and the transmitter is $\pm 20$ MHz. Measurements of BER curves using a simple AD/DA-calibrated free-running ring oscillator will be shown later in Section 7.7. Finally, the front-end linearity has also been extensively studied in Section 3.11. When using a typical commercial preselect filter, the front-end linearity requirement is typically $\text{IIP}2 > 23$ dBm and $\text{IIP}3 > -22$ dBm, which are easily obtainable without calibration and without excessive power consumption.

This chapter showed the advantages of the proposed solution; it focused especially on “real-world” conditions and the severe constraints required for a low-power application. For instance, the entire transceiver can be used with a relaxed LO phase noise and a reduced front-end linearity. The absence of accurate calibration, which usually requires long overhead times, will help to reduce the overall complexity and the power consumption, without significant loss in sensitivity.
Figure 3.38. Harmonic distortions ("coefficient=2") and second-order intermodulation products for IIP2=+23 dBm (with preselect filter, see Table 3.8). The gray-shaded numbers on the top of the graph represent the coefficients for the out-of-band interferers listed on the right. The number in parentheses represents the maximum signal power at the input of the LNA for a minimum distance of 0.1 m and with the preselect filter attenuation given in Table 3.8.

Figure 3.39. Third-order intermodulation products for IIP3=-22 dBm (with preselect filter, see Table 3.8). The thick black lines represent IMP levels for two-tone intermodulation and are used here to determine the worst-case IIP3, whereas the thick gray lines originate from three-tone interference scenarios and are consequently less likely; however, an IIP3 better than -12 dBm is sufficient to bring the three-tone IMP below -83 dBm.
# 4. Fully-integrated CMOS IR-UWB Transmitter

## 4.1. Issues in IR-UWB Signal Generation

Several issues have to be considered in the design of a carrier-based IR-UWB transmitter. First, the signal has to cover a large bandwidth. In our topology, the bandwidth extends up to 1.5 GHz, from 3.3 to 4.8 GHz. Such large fractional bandwidths preclude the use of classical narrowband techniques benefiting from resonance and large quality factors. Associated with the wideband characteristic, the flatness that is required all along the pass-band of the transmitter is less of a concern, since a multi-band transmission principle is applied and the signal strength can be adjusted for each band. The in-band flatness is rather dictated by the fact that the transmitter must be able to send signals as close as possible to the maximum allowed limit to maximize the link margin (see Section 3.9). The second main concern is the spectrum characteristic of the generated RF signal and, more particularly, the out-of-band emission, which has to fit the considered regulation masks (FCC or ECC). A last issue, which mostly depends on the hardware implementation, concerns the unwanted emission of parasitic spurs, such as clocks or local oscillators. The periodic nature of these CW signals may become critical at low RF duty cycles or low pulse rates. These spurious emissions have their energy concentrated in a few Hz and are incompatible with the tight emission masks of the regulations. Another important point is the design effort. It is highly desirable to apply hardware reuse to minimize the time required to design a system such as a whole transceiver. For instance, building blocks such as the RF carrier generation (e.g. PLL) or the baseband filter will be designed to be employed for both the receiver and the transmitter.

## 4.2. Transmitter (Tx) Architectures

In this work, several architectures have been investigated. The simplest topology, which is presented in [86] (Appendix B), is an open-loop switched oscillator without any frequency control. Unfortunately, this kind of transmitter lacks frequency accuracy when used for example with the bandpass-shaped European regulation mask. However, it has served as a first approach for the implementation of a more accurate IR-UWB pulse generator.

Another completely different topology has also been initially investigated and is described in [87] (Appendix C). This circuit investigates pulse shaping directly at RF and is employed as an output on-chip filter with an integrated balanced-to-unbalanced (“balun”) function.

The final solution developed in this Chapter consists in a PLL-based topology including a baseband spectrum-shaping filter. This transmitter generates IR-UWB pulses that are sufficiently accurate in terms of center frequency and shape to meet the requirements targeted in the previous Chapter.

### 4.2.1. Up-conversion Quadrature Tx Architectures

From an architectural point of view, up-conversion topologies are the most popular solutions for RF transmitters. They account for more than two-thirds of the solutions reported over the last 10 years [88]. Typically, the up-conversion of the baseband signal can be done in one (Fig. 4.1-a) or two steps (Fig. 4.1-b). The chosen topology mainly depends on the degree of spectral purity and the resistance against interferers (“frequency plan”) required by the standard or the signalling scheme. Almost all solutions are based on quadrature signal generation, since the modulation schemes are based on M-ary solutions, where M is larger than two. Contrary to these transceivers, mainly intended for narrowband signalling, IR-UWB is mostly based on PPM, BPSK or, in our case, on a proprietary pulsed BFSK scheme with a modulation order $M = 2$. Consequently, this scheme does not require any quadrature topology and direct modulation can be employed.

### 4.2.2. Proposed Transmitter: Direct Modulation

According to Section 3.5, the proposed transmitter must be designed to generate IR-UWB pulses that are modulated by a BFSK scheme. Such a modulation can be based on the direct generation of a carrier-frequency, which corresponds to the center frequency of the RF pulse. With minor additional efforts, the transmitter makes it possible to use other modulation schemes for IR-UWB, such as binary phase shift keying (BPSK) and pulse position modulation (PPM) or a combination thereof. For instance, the combination of PPM and BPSK modulations was the choice of the IEEE 802.15.4a Committee, which has proposed an alternate physical layer using the IR-UWB principle [89].

A block schematic of the proposed IR-UWB transmitter is depicted in Fig. 4.2. The light-gray area corresponds to the baseband section, which is responsible for the pulse envelope generation. It consists of an input stage for signal conversion to an internal differential format and a shaping filter. The input $V_{\text{in}}$ is a digital single-ended signal fed from an external microcontroller or FPGA. This signal is converted into a differential signal for better immunity to supply noise and low-pass filtered ($f_{\text{LP}}$) to reduce spurious side-lobes in the frequency domain after up-conversion. The RF back-end section is depicted by the dark-gray region. It includes the FSK PLL (Chapter 5), whose output frequency is given by $f_{\text{PLL}} = f_{\text{ref}} \cdot N(W_{\text{div}})$, where $N$ is the division ratio fixed by the digital word $W_{\text{div}}$ and $f_{\text{ref}}$ is the reference frequency. The PLL division ratio $N$ is set according to both the transmitted data and the desired channel for multi-channel or frequency-hopping configurations. The RF section also includes the modulator responsible for the up-conversion of baseband pulses and the AM/BPSK modulation. The RF signal is then passed through a power amplifier (PA) stage. Both the modulator and the PA can be disabled between pulse emissions to reduce the overall power consumption.

Figure 4.2. Block schematic of the proposed IR-UWB transmitter chip.

The choice of such an architecture has been dictated by the following reasons: 1) as mentioned in the previous section, most IR-UWB modulations do not require quadrature schemes; 2) the recently adopted masks for UWB indoor communication actually have a bandpass characteristic (see Chapter 2, Fig. 2.2). More particularly, the emission of signals fitting the ECC mask (see Section 2.3.5) requires up-conversion and tight filtering techniques. Moreover, the degree of accuracy required for the RF signal’s center frequency still calls for at least one frequency control method. Several methods exist in the literature, such as the digitally-controlled PLL [90] or the phase-aligned FLL [91]. In this work, a frequency calibration for a free-running VCO using a PLL and analog-to-digital conversion has also been investigated [92] and will be described later in Appendix D.

As a final advantage of direct BFSK modulation, we mention the fact that this scheme avoids the concentration of signal energy at a particular frequency. Such a concentration can be particularly problematic if excessive leakage occurs in the frequency up-conversion (e.g. the LO signal may leak through the modulator to the antenna) or if the transmitter integrated circuit itself radiates some RF energy. Leakage of periodic RF signals with a constant frequency causes spectral lines in the spectrum of the emitted signal. These spurious tones may not fit the low spectral density requirements of the UWB masks. Moreover, this phenomenon may be particularly harmful at low pulse rates or at very low output duty cycles. Directly modulating the pulse’s carrier frequency thus avoids the accumulation of energy around a single frequency. To overcome this issue and to enable very clean IR-UWB spectra, the switchable output stage further helps to reduce any leakage of the LO through the antenna.

4.2.3. Signalling Schemes

As depicted in the dashed rectangles of Fig. 4.2, this type of transmitter architecture is very flexible and enables the generation of several carrier-based UWB signalling schemes.

**LRPM:** stands for *low-rate pulsed modulation* and is the main scheme considered throughout this thesis. Isolated short rectangular pulses with length $T_P$ are generated by the digital baseband section at a typical PRR between 1 and 10 MHz. The rectangular pulses are filtered to reduce the sidelobes and up-converted to RF. BFSK modulation is applied here at the up-conversion. Phase modulation can also be employed for spectrum dithering.

---

\(^1\)This solution only controls the frequency; the coherence of the phase for BPSK is ensured by a baseband triggering signal and limited in time over 30 ns.

**HRPM:** an alternate scheme is high-rate pulsed modulation, where the baseband input is fed by a digital periodic signal, whose frequency is half the PRR. The edges of the baseband waveform are smoothed by the pulse shaping filter. The resulting signal is a sinusoid-like baseband signal, which is up-converted to RF. The sinusoidal shape of the resulting envelope reduces the sidelobes. Each pulse can be directly phase modulated at the RF stage by the phase modulator. This signalling scheme resembles that adopted by the IEEE 802.15.4a Standard [89]. The only difference is that the proposed implementation cannot generate the ternary modulation required by the Standard (see Section 4.4.2 on “Polarity Modulation”).

**PMCW:** The last scheme consists in the generation of phase modulated continuous wave signals. In this case the phase modulation is directly applied on the CW RF signal, whose envelope remains unshaped and constant. This kind of modulation also features UWB properties if the modulation rate is larger than 500 MHz. Applying the phase modulation directly on the RF signal, as proposed in this implementation, enables very fast modulation rates up to 1 GHz. Bursts of PMCW signal can be generated by switching the PA stage on and off (\(EN_{PA}\) signal).

### 4.3. Analog Pulse Shaping for LRPM

#### 4.3.1. Introduction and Principle

Since single Gaussian pulses are not easy to synthesize, the practical implementation of the proposed transmitter leads to an IR-UWB signal having a slightly different characteristic than the one used in the theoretical part of this work (Section 3.4). Typical signals that are available to generate the envelope of the IR-UWB pulses may be provided by a digital device (a micro-controller or an FPGA). Rectangular signals cannot be directly used for the pulse envelope. Their excessive side-lobes, after frequency up-conversion, may fall out of the spectrum mask or appear as a spurious emission in an adjacent channel. The input signal must consequently be low-pass filtered. However, the duration $T_P$ of a baseband pulse can be well defined, since it can be derived from an accurate low-frequency reference clock.

We first consider the PSD $|G(f)|^2$ of a rectangular function of duration $T_P$ (continuous curves of Fig. 4.3). The PSD of such a signal is a squared sinc function, whose zeroes are at multiples of $1/T_P$:

$$|G(f)|^2 = (A_P \cdot T_P)^2 \left[ \frac{\sin(\pi f \cdot T_P)}{\pi f \cdot T_P} \right]^2, \quad (4.1)$$

where $A_P$ is the pulse amplitude. By using a pulse length $T_P$ that corresponds to the period of half the frequency offset $f_o$ of the BFSK modulation, i.e.

$$T_P = (f_o/2)^{-1}, \quad (4.2)$$

we obtain a spectrum with transmission zeroes at multiples of $f_o/2$ and a -10 dB magnitude on the main lobe located at frequency

$$f_{-10dB} \approx 1.107 \cdot f_o/2. \quad (4.3)$$

After up-conversion and frequency modulation with an offset frequency $f_o$, the total bandwidth $BW$ of the composite IR-UWB signal is given by

$$BW = f_o + 2 \cdot f_{-10dB} \approx f_o + 2 \cdot (1.107 \cdot f_o/2). \quad (4.4)$$

The dotted and dashed curves of Fig. 4.3 illustrate the principle of IR-UWB composite signal generation from a rectangular pulse. To enable a multi-channel configuration and minimize adjacent channel overlap, the undesired sidelobes associated with the rectangular shape of the digital input signal $V_{in}$ have to be reduced with an additional low-pass filter ($f_{LP}$). This low-pass transfer function is represented by the gray shaded area in Fig. 4.3. A numerical application valid under both FCC and ECC regulations is detailed in Section 4.3.2.

An interesting aspect comes from Eq. 4.2: the inverse of the baseband pulse length $1/T_P$ and the offset frequency $f_o$ are multiples of each other and can thus simply be derived from the same clock signal. The zero at $1/T_P$ is exactly located at the receiver’s LO frequency (see the “LO” marker on the horizontal axis of the spectrum plot in Fig. 4.3). This configuration avoids DC-offset issues at the baseband after direct down-conversion. It is interesting to note that these properties are independent of the pulse repetition rate. A second advantage comes with the zero at $2/T_P$, which falls on the maximum amplitude of the pulse located at the offset frequency $f_o$ (dotted spectrum), thus relaxing the filter order needed for side-lobe reduction.
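As a rough numerical check of Eqs. 4.1 to 4.4, the following Python sketch evaluates the PSD of a rectangular pulse at its expected nulls and computes the composite bandwidth. The offset frequency of 295 MHz is borrowed from Section 4.3.2; the function names and the unit pulse amplitude are assumptions made for illustration only, and the factor 1.107 is taken from Eq. 4.3 as given.

```python
import math

def rect_psd(f, t_p, a_p=1.0):
    """PSD of a rectangular pulse of duration t_p, following Eq. 4.1."""
    x = math.pi * f * t_p
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return (a_p * t_p) ** 2 * sinc ** 2

def composite_bandwidth(f_o, k_10db=1.107):
    """Composite BFSK bandwidth of Eq. 4.4; k_10db is the factor of Eq. 4.3."""
    f_10db = k_10db * f_o / 2.0
    return f_o + 2.0 * f_10db

f_o = 295e6              # offset frequency in Hz (value used in Section 4.3.2)
t_p = 1.0 / (f_o / 2.0)  # pulse length from Eq. 4.2

# The spectral nulls are expected at multiples of 1/T_P = f_o/2.
for k in (1, 2, 3):
    f_null = k / t_p
    rel = rect_psd(f_null, t_p) / rect_psd(0.0, t_p)
    print(f"null {k}: {f_null / 1e6:.1f} MHz, relative PSD {rel:.1e}")

print(f"composite BW = {composite_bandwidth(f_o) / 1e6:.1f} MHz")
```

The relative PSD at the nulls is zero only up to floating-point rounding, which is why it is printed rather than compared with exactly zero.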

### 4.3.2. Tx Parameters under FCC and ECC Regulations

With the proposed modulation, it has been possible to find a set of parameters $f_{ref}$, $N(W_{div})$ and $f_{LP}$ that fits both the ECC and the FCC masks with only a change in the division ratio $N$ of the PLL defined by the digital word $W_{div}$. The PLL is of integer-N type with half-integer division steps and is described in detail in the next Chapter. By choosing a PLL reference frequency $f_{ref} = 295$ MHz, the offset frequency $f_o$ for the BFSK modulation has the same value. According to Eq. 4.4, this results in a composite signal bandwidth of approximately 620 MHz. Under FCC regulation between 3 and 5 GHz, three channels A, B and C occupied with composite BFSK IR-UWB signals can be generated. The corresponding division ratios for the six carrier frequencies are 11, 12, ..., 16, respectively. Under the European regulation, bandwidth for only two such channels is available. In this case, the four carrier frequencies are set by division ratios fixed at 12.5/13.5 for channel AA and 14.5/15.5 for channel BB. Furthermore, choosing a corner frequency $f_{\text{LP}} = 160$ MHz and a 4th-order filter reduces the sidelobes to fit the regulation masks and leads to a composite BFSK signal bandwidth in excess of 500 MHz. Simulations of the resulting spectra are shown in Fig. 4.4 and 4.5. With this set of parameters, the emitted signal theoretically fits the stringent ECC regulation mask, and more particularly the minimum out-of-band attenuation of 43.7 dB that is needed at the mask transition at 3.4 GHz (see Section 2.3.5).
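The channel plan described above can be tabulated with a few lines of Python, using $f_{\text{PLL}} = f_{\text{ref}} \cdot N$. This is only a sketch: the function name is hypothetical, and the pairing of the integer ratios into channels A, B and C as (11, 12), (13, 14) and (15, 16) is an assumption that mirrors the explicit pairing given for channels AA and BB.

```python
def channel_plan(f_ref_mhz, ratios):
    """Return (f_low, f_high, f_center) in MHz for each pair of division ratios."""
    carriers = [f_ref_mhz * n for n in ratios]  # f_PLL = f_ref * N
    return [(lo, hi, (lo + hi) / 2.0)
            for lo, hi in zip(carriers[0::2], carriers[1::2])]

F_REF_MHZ = 295.0  # reference frequency chosen in the text

# FCC: integer ratios 11..16; the pairing into channels is an assumption.
fcc_channels = channel_plan(F_REF_MHZ, [11, 12, 13, 14, 15, 16])
# ECC: half-integer ratios for channels AA and BB, as stated in the text.
ecc_channels = channel_plan(F_REF_MHZ, [12.5, 13.5, 14.5, 15.5])

for name, (lo, hi, fc) in zip("ABC", fcc_channels):
    print(f"FCC channel {name}: {lo:.1f}/{hi:.1f} MHz, center {fc:.1f} MHz")
for name, (lo, hi, fc) in zip(("AA", "BB"), ecc_channels):
    print(f"ECC channel {name}: {lo:.1f}/{hi:.1f} MHz, center {fc:.1f} MHz")
```

In both plans the two carriers of a channel are separated by exactly $f_{ref}$, which is why the BFSK offset frequency equals the reference frequency.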

4.4. The Modulator and the Output Stage

4.4.1. Architecture Overview and Specifications

Figure 4.6 shows the architecture of the RF section of the transmitter. It contains a modulation section (two top areas) and a power amplification section (bottom area). The modulation section consists of the amplitude modulation (AM) and the polarity modulation (PM) circuits for binary phase shift keying (BPSK) and the frequency up-conversion section, whose differential input $V_{\text{PLL}+}/V_{\text{PLL}-}$ is connected to the VCO (not shown, see Section 5.3).

The PA section uses dynamic biasing for power consumption reduction. During pulse emission, an instantaneous bias current $i_{\text{AM}}$ for the PA is derived from the baseband envelope signal $V_{\text{env}}$. The latter signal changes the bias configuration of the final stage of the PA from class-AB to class-A. Keeping the same topology in the modulator (M1-M4) and the PA bias circuit (M7-M10) allows for precise alignment of the bias current with the RF signal envelope. Current reuse has been implemented between the source follower pre-amplifier (M11, M12) and the power transistors of the final stage (M13, M14).

4.4.2. The AM/PM Modulation Section

Both signal and bias modulation sections are based on pseudo differential circuits formed by transistors M1-M2 and M7-M8 and degenerated by R1-R2 and R7-R8, respectively (Fig. 4.6), to increase dynamic range.
**Figure 4.4.** Three-channel configuration for BFSK modulation within FCC lower band (the dashed line represents the corresponding regulation mask).

**Figure 4.5.** Optimum two-channel configuration for BFSK modulation within ECC lower band (the dashed line represents the regulation mask). Note also that the ECC mask used here is an earlier version than the final one described in Chap. 2.

Figure 4.6. Simplified schematic of the transmitter RF section.

**Amplitude Modulation Circuit**

The AM signal is directly generated by the input envelope signal $V_{env}$ coming from the pulse shaping filter. This signal is applied at the input of a PMOS degenerated differential pair (M1-M2). This pair makes up the transconductance input stage of a folded Gilbert cell up-conversion mixer. The folding operation and the up-conversion stage are realized by current mirrors M3-M6 and the cross-coupled transistors Mcc, respectively. The pseudo-differential topology helps to increase the linearity of the circuit owing to the reduced gain compression effect of the common-source stage formed by M1-M2, as shown in [93]. The main drawback of pseudo-differential topologies is the lack of common-mode rejection owing to the absence of a tail transistor with high output impedance (transistors Mb are switches employed to disable the entire circuit between pulse transmissions). Such a topology thus requires a well-controlled common-mode level at its input. This condition is actually met for $V_{\text{env}}$, since this signal is provided by a fully-differential buffer with a common-mode regulation at the output of the pulse shaping filter (see Section 4.5).

**Polarity Modulation Circuit**

The modulated differential current $i_{\text{AM}}$ is mirrored to the up-conversion double-balanced mixer through a polarity inversion circuit to achieve BPSK modulation. This circuit is realized with analog transmission gate (TG) 2:1 multiplexers. This topology enables fast polarity changes in less than 500 ps.

Although the voltage dynamic range of this current mirror topology is not critical, TG multiplexers have been used because of the simplicity of the implementation. In this realization, only a binary phase inversion has been implemented. The ternary (0/+/-) modulation required by the IEEE 802.15.4a Standard during the preamble [89] can be easily implemented with a 3:1 multiplexer, whose third input is connected to the diode-connected transistor of an unmodulated replica of the AM-modulation circuit.

### 4.4.3. Power Amplifier Section

**Description**

IR-UWB signals do not require any power between emission periods. This allows switching off the power-consuming stages between pulse emissions for significant power savings. For this purpose, a control signal $EN_{\text{PA}}$ (dashed line in the time diagram of Fig. 4.6) from the bias control circuit can completely disable the output section.

In order not to significantly distort the transmitted pulse, a class-A or class-AB amplifier is used in this application. However, these topologies still require large bias currents during pulse emission. The bias current is fixed by the maximum current swing at the load $R_L$ (antenna). As a consequence, for pulsed signals, the amplifier is over-biased most of the time and wastes power during pulse transients.

Significant power can be further saved by dynamically varying the bias of the output stage according to the pulse envelope. Such a solution has for example been proposed by Saleh and Cox in [94] and is also known as “class-\(\bar{A}\)”. This principle has been applied here to a pseudo-differential class-AB stage, as shown in Fig. 4.7. The variable bias \(V_{b,dyn}\) is the sum of a constant bias \(V_{b,cst}\) and a variable bias \(V_{b,env}\). The constant bias voltage \(V_{b,cst}\) is enabled by \(EN_{PA}\) and sets the operating point of the output transistors \(M_{13}\) and \(M_{14}\) in class-AB. As shown in Fig. 4.6, \(V_{b,env}\) is derived directly from the input pulse envelope \(V_{env}\). An input circuit (\(M_7-M_{10}\) and MUX) similar to the one used for the AM/PM modulation has been employed to ensure a matched alignment between the dynamic bias and the input RF signal \(V_{rf,in}\).

Figure 4.7. Principle of pseudo-differential class-AB output stage with dynamic bias.

The output transistors \(M_{13}\) and \(M_{14}\) are large (\(W_{13} = W_{14} = 240\,\mu\text{m}\)) and cannot be directly driven by the output of the Gilbert cell (up-conversion). An additional source follower stage (\(M_{11}, M_{12}\)) is thus inserted between the Gilbert cell and the final stage. To further reduce power consumption, the bias of this stage follows the same principle. Current reuse is employed by placing the source follower stage on top of the final stage.

**Post-layout Simulations**

This circuit has not been separately integrated and could not be completely characterized. We however demonstrate its performance by means of post-layout simulations. We first illustrate the power consumption profile of the output stage in the time domain in Fig. 4.8. In this example, running at 10 Mp/s, a differential input LO amplitude of 600 mVpp and a peak voltage of 500 mV for the envelope $V_{env}$ are assumed. The enable signal $EN_{PA}$ has a duty-cycle of 20% (20 ns). With a supply voltage of 1.8 V, the peak power consumption of the output stage is estimated to be 20 mW, whereas the average power consumption is reduced down to 1.54 mW owing to the dynamic bias scheme. The RF pulse amplitude at a differential load of 100 $\Omega$ is 1 Vpp. The overall power consumption of the entire RF section, as shown in Fig. 4.6, has been simulated at 3.7 mW.
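The figures quoted above can be related to each other with some simple arithmetic, sketched below in Python. The function names are hypothetical, and the energy-per-pulse values are derived here from the simulated average powers; they are not stated in the text. The "gated only" figure assumes, as a simplification, that the stage would draw its 20 mW peak power during the whole enable window.

```python
def duty_cycle(on_time_s, pulse_rate_hz):
    """Fraction of time during which EN_PA is active."""
    return on_time_s * pulse_rate_hz

def energy_per_pulse(avg_power_w, pulse_rate_hz):
    """Average energy spent per emitted pulse."""
    return avg_power_w / pulse_rate_hz

PRR = 10e6      # pulse repetition rate, 10 Mp/s
T_ON = 20e-9    # active window of EN_PA, 20 ns
P_PEAK = 20e-3  # simulated peak power of the output stage
P_PA = 1.54e-3  # simulated average power of the output stage
P_RF = 3.7e-3   # simulated average power of the entire RF section

d = duty_cycle(T_ON, PRR)
p_gated_only = P_PEAK * d  # simplification: peak power over the whole window

print(f"duty cycle            : {d * 100:.0f} %")
print(f"gated only (assumed)  : {p_gated_only * 1e3:.2f} mW")
print(f"with dynamic bias     : {P_PA * 1e3:.2f} mW")
print(f"PA energy per pulse   : {energy_per_pulse(P_PA, PRR) * 1e12:.0f} pJ")
print(f"RF section per pulse  : {energy_per_pulse(P_RF, PRR) * 1e12:.0f} pJ")
```

Under this simplification, gating alone would leave about 4 mW of average consumption, so the simulated 1.54 mW suggests that the envelope-tracking bias accounts for a further reduction by a factor of roughly 2.6.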

Figure 4.8. a) Simulated average TX power consumption breakdown (PA: power amplifier, MOD: modulation and up-conversion) with 20% duty-cycle scheme at 10 Mp/s. b) Corresponding generated RF pulses and $EN_{PA}$ signal.

A plot of the normalized output spectrum obtained with an ideal Gaussian pulse envelope at the input of the modulator is given in Fig. 4.9 for three different frequencies between 3.3 GHz and 4.8 GHz. The gain flatness of the entire RF section of the transmitter is also shown.

### 4.5. Pulse Shaping Filter

#### 4.5.1. Introduction

The pulse shaping filter used for the generation of the short IR-UWB pulses is realized by a continuous-time filter based on transconductance devices and integrated capacitors only [95]. This implementation has been preferred over a digital implementation owing to its lower power consumption. A digital implementation of short pulsed signals of a few nanoseconds with a well-controlled envelope would require more than four times oversampling, leading to a reference clock in the GHz range. Furthermore, more than 5 bits per sample are required to obtain very low out-of-band signal components smaller than -50 dBc. This resolution increases the complexity of the pulse shaping filter [96] and its power consumption.

Figure 4.9. Post-layout simulation of the transmitter’s RF section (modulator and power amplifier) for three different frequencies. The three output spectra are normalized to the maximum value to highlight gain flatness over the entire bandwidth. The latter varies between -5 dB and 0 dB and shows a slightly growing characteristic to compensate for increasing losses toward high frequencies (test board parasitics). Distortion analysis can be performed by comparing the output spectrum with the input signal (here a Gaussian signal is assumed, dashed lines). Small spectral regrowth is observed below -35 dBc. Third order harmonics above 10 GHz can be easily filtered out by the antenna transfer function.

The requirement for the shaping filter used in LRPM and HRPM (Section 4.2.3) is a 4th-order low-pass Chebyshev filter with 0.5-dB ripple. These values have been obtained by adjusting the transmitted spectrum to both FCC and ECC regulation masks. Although a minimum order of four has been specified in Section 4.3.2 for the transmission of IR-UWB pulses, a 5th-order filter is proposed here to allow reuse of the shaping filter in the receiver for channel filtering\(^2\). A simple modification of the cut-off frequency will adapt the characteristic for the Tx and the Rx mode.

4.5.2. Implementation

The equivalent schematic of the shaping filter is given in Fig. 4.10. Sub-figure a) depicts the prototype filter of the 0.5-dB ripple 5th-order Chebyshev filter with a cut-off frequency of 170 MHz. The filter synthesis starts with the correct selection of the equivalent L and C values. The classical filter theory gives

\[
\begin{aligned}
C_1 &= \frac{a_1}{2\pi f_0 R_0}, \\
L_2 &= \frac{a_2 R_0}{2\pi f_0}, \\
C_3 &= \frac{a_3}{2\pi f_0 R_0}, \\
L_4 &= \frac{a_2 R_0}{2\pi f_0}, \\
C_5 &= \frac{a_1}{2\pi f_0 R_0},
\end{aligned}
\quad (4.5)
\]

where \( a_1 = 1.8069, a_2 = 1.3025 \) and \( a_3 = 2.6915 \); \( f_0 \) is the cutoff frequency and \( R_0 \) is the load and source resistance of the LC-ladder filter, i.e. \( R_0 = R_S = R_L \).

Figure 4.10. Pulse shaping filter. a) LC-ladder prototype Chebyshev filter, b) block schematic with capacitors, OTA and gyrators derived from a), c) detailed electrical circuit of a transconductance cell, d) gyrator equivalent block schematic.

\(^2\)The requirements in terms of out-of-band attenuation for the Rx channel filter are tighter (≥ 5th-order).
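The coefficients $a_1$ to $a_3$ quoted above can be reproduced from the textbook closed-form expressions for Chebyshev low-pass prototypes, provided the prototype is renormalized from the ripple-band edge to the -3 dB frequency. The Python sketch below does this and then evaluates the ladder elements. The renormalization is an inference from the numerical match, not a statement from the text, and the value $R_0 = 1.68$ k$\Omega$ is only a placeholder (chosen equal to the reference resistor of the tuning circuit); all function names are hypothetical.

```python
import math

def chebyshev_g(n, ripple_db):
    """Prototype values g_1..g_n, normalized to the ripple-band edge (1 rad/s)."""
    eps = math.sqrt(10 ** (ripple_db / 10) - 1)
    gamma = math.sinh(math.asinh(1 / eps) / n)
    a = [math.sin((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]
    b = [gamma ** 2 + math.sin(k * math.pi / n) ** 2 for k in range(1, n + 1)]
    g = [2 * a[0] / gamma]
    for k in range(1, n):
        g.append(4 * a[k - 1] * a[k] / (b[k - 1] * g[k - 1]))
    return g, eps

def coeffs_3db(n, ripple_db):
    """Prototype values rescaled so that the -3 dB point sits at 1 rad/s."""
    g, eps = chebyshev_g(n, ripple_db)
    w3 = math.cosh(math.acosh(1 / eps) / n)  # -3 dB frequency of the raw prototype
    return [gk * w3 for gk in g]

def ladder_elements(coeffs, f0, r0):
    """Shunt C (1st, 3rd, 5th) and series L (2nd, 4th) values of the LC ladder."""
    w0 = 2 * math.pi * f0
    return [c / (w0 * r0) if i % 2 == 0 else c * r0 / w0
            for i, c in enumerate(coeffs)]

a = coeffs_3db(5, 0.5)
print("a1..a5 =", [round(x, 4) for x in a])

F0 = 170e6   # cut-off frequency from the text
R0 = 1.68e3  # placeholder termination resistance (assumption)
for i, (value, kind) in enumerate(zip(ladder_elements(a, F0, R0), "CLCLC"), 1):
    unit, scale = ("pF", 1e12) if kind == "C" else ("uH", 1e6)
    print(f"{kind}{i} = {value * scale:.3f} {unit}")
```

The symmetry $a_4 = a_2$ and $a_5 = a_1$ used in the expressions above follows from the odd filter order with equal source and load terminations.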

In the case of a low-pass LC ladder, inductors can be directly synthesized by a gyrator-capacitor combination, as shown in Fig. 4.10-b). This is possible since a shunt capacitor naturally appears on both the input and the output ports of each gyrator [97]. The parasitic capacitance thus forms a part of the functional capacitances \( C_2 \) and \( C_4 \). The same observation applies for capacitors \( C_1, C_3 \) and \( C_5 \) with the transconductance devices.

The transconductance \( g_m \) is implemented with an operational transconductance amplifier (OTA). Non-ideal OTAs introduce mainly two types of degradation in the transfer function when used as an integrator [95]. First, the DC gain of an integrator is finite. This is caused by the non-zero output conductance \( g_0 \) of the OTA. Assuming an output load capacitance \( C_L \) for the OTA, the low-frequency dominant pole is shifted from the origin \( \omega_d = 0 \) to \( \omega_d = g_0/C_L \), below which the gain is limited to \( g_m/g_0 \) instead of growing without bound (\( g_0 = 0 \)). The equivalent effect on the transfer function of the integrator is to introduce a phase lead at low frequencies. The second source of error is the non-dominant pole \( \omega_{nd} \) at high frequencies; the latter introduces a phase excess in the transfer function. From simulations of the prototype filter topology with non-ideal OTAs, it can be shown that, in order not to significantly degrade the transfer function of a Chebyshev filter, the dominant and non-dominant poles must lie, respectively, more than one order of magnitude below and above the unity gain frequency \( f_0 \) of the integrator. An illustration of this phenomenon is given in Fig. 4.11. In sub-figure a), the transconductances of both an ideal and a non-ideal OTA are depicted. The non-ideal OTA exhibits a dominant and a non-dominant pole at 5.3 MHz and 4.8 GHz, respectively, which corresponds to \( f_d < f_0/30 \) and \( f_{nd} > 30 \cdot f_0 \). The degradation of the transfer function of the shaping filter is given in sub-figure b). The main effect of the non-ideal OTA is a slight increase of the ripple from 0.5 dB to 1.2 dB, which is acceptable for our application. Negligible degradation is observed on the cut-off frequency and in the stop-band.
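To give a feeling for the size of the two phase errors, the sketch below models the integrator as $H(j\omega) = \omega_0 / (j\omega + \omega_d) \cdot 1/(1 + j\omega/\omega_{nd})$ and evaluates the deviation from the ideal -90° at the unity gain frequency. The two-pole model, the function name and the value $f_0 = 160$ MHz (picked so that both factor-of-30 conditions quoted above hold) are assumptions made for illustration.

```python
import math

def integrator_phase_error_deg(f, f_d, f_nd):
    """Phase deviation from the ideal -90 deg of a two-pole integrator model.

    Returns (lead, lag): the lead caused by the finite DC gain (dominant pole
    f_d) and the lag caused by the non-dominant pole f_nd, both in degrees.
    """
    lead = math.degrees(math.atan(f_d / f))
    lag = math.degrees(math.atan(f / f_nd))
    return lead, lag

F0 = 160e6   # assumed unity gain frequency of the integrator
F_D = 5.3e6  # dominant pole of the non-ideal OTA (from the text)
F_ND = 4.8e9 # non-dominant pole of the non-ideal OTA (from the text)

lead, lag = integrator_phase_error_deg(F0, F_D, F_ND)
print(f"phase lead  : {lead:.2f} deg")
print(f"phase excess: {lag:.2f} deg")
print(f"net error   : {lead - lag:+.3f} deg")
```

With poles placed symmetrically around $f_0$ on a logarithmic scale, the lead and the lag are of similar magnitude at $f_0$ and partly cancel there; away from $f_0$ they no longer balance, which is one plausible reading of the ripple increase reported for Fig. 4.11.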

Figure 4.11. Influence of a non-ideal OTA (a) on the transfer function of the pulse shaping filter (b).

**Automatic Tuning Control**

Since the goal is mainly to remove sidelobes in the spectrum of the digital input signal, the accuracy of the cut-off frequency can be relaxed to approximately ±10%. To reach this accuracy, there is actually no need to compensate for variations of both transconductance $g_m$ and capacitors. In this filter implementation, only the transconductance $g_m$ will be servoed. The calibration of $g_m$ will simply rely on the value of an external resistor $R_r$ and thus avoid the use of a PLL. It is derived from the work of Pavan [98]. The $g_m$-calibration circuit is shown in Fig. 4.12.

The circuit results in a transconductance that accurately tracks the conductance $1/R_r$ of an external resistor. The principle is explained hereafter. For a small $\Delta V_r$, the current difference at the output of the OTA is

$$I_1 - I_2 = \Delta I = g_m(I) \cdot \Delta V_r, \quad (4.6)$$

where $g_m(I)$ is the transconductance to track, which is a function of the tail current $I = 2I_b$. The equation governing the voltage at the reference resistor is

$$I_r \cdot R_r = \Delta V_r. \quad (4.7)$$

Combining the two previous equations, we obtain

$$\Delta I = g_m(I) \cdot I_r \cdot R_r. \quad (4.8)$$

By forcing $I_r$ to be equal to $\Delta I$, we get \(g_m(I) = 1/R_r\). Practically, the gate voltage $V_b$ of the tail current source of the transconductance is adjusted by a differential transresistance device (DTR), which performs the difference operation of a negative feedback loop. If the tail’s gate voltage $V_b$ is too low, $g_m$ is smaller than $1/R_r$ and $\Delta I$ is smaller than $I_r$. A positive current then enters the DTR, which raises the gate voltage $V_b$. A steady-state operating point is reached for $\Delta I = I_r$.
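The settling behaviour of such a loop can be illustrated with a very coarse behavioural model in Python. Everything below the level of Eq. 4.8 is an assumption: the square-law dependence $g_m = \sqrt{k \cdot I}$, the constant `k`, the starting current, and the way the DTR is lumped into a proportional update of the tail current. It is a sketch of the feedback principle, not a model of the actual circuit.

```python
import math

def settle_gm(r_ref, k=2e-3, i_tail=20e-6, loop_gain=0.2, steps=300):
    """Iterate a toy gm-tracking loop until gm approaches 1/r_ref.

    k (A/V^2), i_tail (A) and loop_gain are hypothetical illustration values.
    """
    for _ in range(steps):
        gm = math.sqrt(k * i_tail)          # assumed square-law gm(I)
        error = 1.0 - gm * r_ref            # (I_r - delta_I) / I_r, from Eq. 4.8
        i_tail *= 1.0 + loop_gain * error   # positive error raises the bias
    return math.sqrt(k * i_tail), i_tail

R_REF = 1.68e3  # external reference resistor value used in the text
gm, i_tail = settle_gm(R_REF)
print(f"gm     = {gm * 1e6:.1f} uS")
print(f"1/R_r  = {1e6 / R_REF:.1f} uS")
print(f"I_tail = {i_tail * 1e6:.1f} uA (depends on the assumed k)")
```

Starting with a bias that is too low, the error term is positive and the tail current grows until $g_m R_r = 1$, which mirrors the sign convention described above.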

Post-layout Monte-Carlo simulations of the -3 dB cutoff frequency are given in Fig. 4.13 for $R_r = 1.68 \ \text{k}\Omega$. The nominal cutoff frequency is 170 MHz. The statistical distribution over process variations and mismatch exhibits a $3\sigma$ variation (99.7%) of 22.5 MHz\(^3\), which corresponds to a spread of $\pm 13\%$. This is slightly more than the targeted 10% but corresponds to the typical variation of the MIM-capacitors (±12%). It is however less than half the ±50 MHz simulated without any \( g_m \)-stabilization circuit.

\(^3\)With process-only variations, the $3\sigma$ spread is approximately 15 MHz.
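For reference, the quoted spread follows directly from the Monte-Carlo statistics, taking $\sigma = 7.5$ MHz from Fig. 4.13 and the nominal cutoff frequency of 170 MHz:

$$3\sigma = 3 \times 7.5\ \text{MHz} = 22.5\ \text{MHz}, \qquad \frac{22.5\ \text{MHz}}{170\ \text{MHz}} \approx 13.2\,\%.$$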

![Histogram of cutoff frequency with mean \( \mu = 170.7 \text{ MHz} \) and standard deviation \( \sigma = 7.5 \text{ MHz} \).]

**Figure 4.13.** Monte-Carlo simulations of the cutoff frequency.

### 4.5.3. Experimental Results

An integrated circuit has been separately implemented for test purposes. The transfer function has been measured with differential S-parameters and compared with a nominal simulation (no process variation and mismatch). Figure 4.14 shows the results. The measured (solid) and simulated (dashed) curves exhibit an excellent match. An inset figure zooms around the corner frequency. On the measured sample chip, the corner frequency differs by less than 3 MHz. The high-pass characteristic comes from the DC-blocking capacitors used for the measurement on the test PCB. The current consumption of the filter is 1.6 mA for the filter core and 0.3 mA for the \( g_m \)-stabilization circuit. The bias current of a single \( g_m \)-cell is approximately 100 \( \mu \text{A} \).

### 4.6. Tx Measurements

#### 4.6.1. BFSK IR-UWB vs. FCC Regulation

The transmitter has originally been designed to generate IR-UWB pulses whose -10 dB bandwidth is approximately 200 MHz and whose center frequency \( f_c \) is modulated by a BFSK scheme with an offset modulation frequency of \( f_0 = \pm 150 \text{ MHz} \) (Section 4.2.3). This results in a composite 500 MHz IR-UWB signal, which can be easily processed by direct conversion and non-coherent detection schemes (Chap. 3).

Figures 4.15-a/b report the measured PSDs obtained when applying this scheme with center frequencies \( f_c \) located at 3.45, 4.05 and 4.65 GHz. The PSDs are depicted in transparency to emphasize the spectrum overlap between channels. The maximum power from the adjacent channels occurs for channel B and is approximately 30 dB below the inband power. This small channel crosstalk comes from the spectral roll-off, which has been measured at values higher than 70 dB/GHz. Figure 4.15-b is a zoomed version between 2.5 and 6 GHz and indicates the measured \(-10\) dB bandwidth of each channel. The dash-dotted lines show the LO frequencies for each of the considered channels. The offset frequency modulation \( \pm f_0 \) follows a pseudo-random sequence. Except for the PLL reference frequency at 300 MHz, the measured PSDs are free of parasitic spurs. Note that for a fair comparison, the PSDs are shown without any normalization or fitting to the regulation masks. We observe that the overall output PSD suffers from a low-pass effect: the overall emitted power of channel C is more than 10 dB below that of channel A. This is probably due to higher-than-expected parasitics at the output stage and/or an increased loss of the carrier toward higher frequencies. Note that no pulse amplitude regulation has been implemented in this prototype.

An issue in output peak PSD calibration comes from the fact that the IR-UWB transmitted signal is meant to have noise-like properties. The FCC requires that the PSD be measured using a swept-tuned spectrum analyzer with a specified RBW, an RMS detector and an equivalent antenna gain of 0 dBi (an isotropic antenna is assumed). The implementation of such a technique on-chip is impossible, especially if one wants to obtain an absolute calibration of the peak PSD. A way to alleviate this issue is the implementation of a closed-loop calibration of the entire transceiver. Assuming that the receiver front-end can be made relatively flat over the bandwidth of interest, such a solution would enable a calibration of the emitted power to obtain a relatively flat in-band characteristic by monitoring the incoming signal with the help of the control voltage of the receiver’s automatic gain control (AGC). Some margin in the transmitted power has however to be left to compensate for variations in absolute peak PSD values (PVT on receiver gain). The accurate calibration of the output PSD in IR-UWB to maximize the output power is still an issue.

The discontinuity at the noise floor (-80 dBm/MHz) in Fig. 4.15-a at 3 GHz is simply caused by the measuring instrument (spectrum analyzer Rohde & Schwarz FS-30), which features a sensitivity offset at this frequency. According to the FCC regulation, the PSDs have been measured with a spectrum analyzer equipped with a true average RMS power detector and set to a resolution bandwidth of $RBW = 1$ MHz. Figures 4.16-a/b/c report screen copies of a high frequency digital sampling scope (DSO “Infiniium” 20 GSa/s scope from Agilent) and show the composite signals obtained in the three IR-UWB BFSK channels. The superimposition of the frequency shifted pulses shows that a good balance between the two pulse amplitudes in each channel is obtained.

4.6.2. BFSK IR-UWB vs. ECC Regulation

In this section, the compliance with the European UWB regulation (see Chap. 2) has also been investigated. The transmitter has been configured such that the generated pulses fit the ECC mask. In particular, the PLL reference clock $f_{clk}$ has been set to 290 MHz and the last four integer division ratios, i.e. 13 to 16, have been used to generate the signal carriers for the frequency modulation. The ECC mask thus accommodates two channels with the proposed modulation. The channels have been named “BC” and “CC” and are centered at 3.915 and 4.495 GHz, respectively.

Figure 4.15. Measured PSDs of the IR-UWB BFSK signal a) from 0.2 to 11 GHz and b) centered around channels (RBW=VBW=1 MHz). For a fair comparison, the PSDs are shown without any normalization to the spectrum mask.

Figure 4.16. Measured output RF voltage of the composite IR-UWB BFSK signals for the three channels. The ringing beyond 12 ns comes from the low-pass shaping filter and helps to reduce the side-lobes in the frequency domain to fit the regulation masks. Keeping only one ringing after the main pulse does not significantly affect the output PSD. Consequently, the minimum active period of the output stage can be set to approximately 20 ns (20% duty-cycle at 10 Mp/s).

Figure 4.17. Measured PSD of the IR-UWB signal against ECC spectrum mask, a) from 0.2 to 11 GHz and b) centered around channels (both with 7 dB margin reduction applied to fit the mask). Note also that the ECC mask used here is an earlier version than the final one described in Chap. 2.

The experimental data shown in Fig. 4.17 have been normalized to -41.3 dBm/MHz. A corresponding amplitude reduction of 7 dB, which is equivalent to the amplitude margin, has been applied to the spectrum to fit this level. As observed around and below 3.4 GHz, the spectrum does not fit the ECC mask. This is partly caused by the sidelobe and the noise floor of the spectrum analyzer (gray circle). Under the conditions imposed by the regulation ($RBW = 1$ MHz), it was very difficult to achieve measurements with a noise floor below -85 dBm/MHz. This level is simply not attainable by most spectrum analyzers without the use of a wideband low-noise pre-amplification device. Close to 3.4 GHz, a sidelobe also contributes to the energy exceeding the -85 dBm/MHz level, but it can be reduced by applying a slightly narrower filter to the baseband pulse. The European rules actually allow a lower limit on the UWB signal bandwidth. The PLL reference frequency also appears here in the PSD measurements. However, this signal will be filtered at emission by the bandpass characteristic of the antenna.

### 4.6.3. PMCW Measurements

Although this mode has not been intended to be used in a complete Rx-Tx link in this work, measurement results are given here for the sake of completeness. The BPSK burst mode is an interesting alternative way to generate UWB-like signals. The bandwidth extension does not rely on the short duration of the pulse, but on the high modulation rate of a CW burst. The long-duration burst can be used in a PPM modulation scheme to ease the implementation of a receiver (non-coherent demodulation). In this case, the BPSK modulation is not used to carry any data. This modulation scheme has been proposed in the IEEE 802.15.4a standard.

The measurements given here make use of this method and also illustrate the maximum performance of the transmitter IC in terms of modulation rate for a BPSK scheme.

First, the measurement of a BPSK-modulated CW wave in the time domain is given in Fig. 4.18. Owing to fast polarity changes shorter than 500 ps, the transmitter is able to emit BPSK signals at rates of up to 1 Gchip/s. In this configuration, the circuit consumes 45 mA of current from a supply voltage of 1.8 V; this corresponds to an energy consumption of 81 pJ/chip.

Figure 4.19 illustrates in detail the emission of a burst of an unshaped BPSK-modulated CW signal. Sub-figure a) shows a burst of unmodulated CW signal as generated by the PLL. In the second plot, we observe that the accuracy of the instantaneous frequency can be relaxed to $\pm 50$ MHz. This results in an accumulated phase jitter of less than $\pm \pi/4$ radians during the burst. This value would, for instance, still allow coherent demodulation of the BPSK signal. The time $t_{PLL,ON}$, at which the PLL is switched on, has been deliberately set to a value that shows the effect of excessive jitter at start-up, as shown in plot c). However, duty-cycling the PLL can save up to one third of the power consumption. Further power reduction is achieved by enabling the PA only during bursts. At 1 Mburst/s and assuming 16 chips/burst, this reduces the average current consumption by a factor of four down to 5.5 mA, which is equivalent to an estimated energy requirement of 620 pJ/chip or 10 nJ/burst (see Table 4.2).
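The energy figures quoted for the PMCW mode follow directly from the supply power divided by the event rate. The short sketch below reproduces them; it assumes that the quoted currents are averages drawn from the 1.8 V supply, and the helper name `energy_per_event_j` is introduced here purely for illustration.

```python
# Sketch: energy per chip and per burst from average supply current.
# Assumption: the quoted currents are averages drawn from the 1.8 V supply.
V_SUPPLY = 1.8  # V

def energy_per_event_j(current_a, rate_hz, v_supply=V_SUPPLY):
    """Average energy per event (chip or burst): supply power / event rate."""
    return current_a * v_supply / rate_hz

# Continuous BPSK modulation: 45 mA at 1 Gchip/s
e_chip_continuous = energy_per_event_j(45e-3, 1e9)

# Duty-cycled bursts: 5.5 mA average at 1 Mburst/s, 16 chips per burst
e_burst = energy_per_event_j(5.5e-3, 1e6)
e_chip_burst = e_burst / 16

print(f"continuous: {e_chip_continuous * 1e12:.0f} pJ/chip")
print(f"burst mode: {e_burst * 1e9:.1f} nJ/burst, "
      f"{e_chip_burst * 1e12:.0f} pJ/chip")
```

With these inputs the continuous mode evaluates to 81 pJ/chip and the burst mode to about 9.9 nJ/burst, or roughly 620 pJ/chip, in line with the rounded values given in the text.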

4.6.4. LO leakage

Between IR-UWB pulses or transmission bursts, the LO controlled by the PLL cannot ideally be switched off, especially when the pulse repetition rate is higher than 5 Mp/s. At such low repetition rates, the LO signal energy that leaks through the output stage may create a narrow spectral line in the output spectrum at the pulse’s center frequency. The LO leakage in the absence of a baseband signal has been measured on a spectrum analyzer with minimal RBW (30 Hz) to detect the very small signal leakage that occurs with a disabled output stage. Figure 4.20 illustrates the isolation obtained by powering off the output stage between pulse emissions. An LO attenuation at the TX output of more than 50 dB is obtained.
Figure 4.19. Measured burst of unmodulated (a, b and c) and modulated (d) PMCW signal around 4.5 GHz. Subplot a) represents a burst of unmodulated pulses; b) and c) show the corresponding instantaneous frequency and accumulated jitter, respectively; d) depicts a burst of 16 consecutive BPSK-modulated pulses (chips) at a modulation rate of 500 MHz. The input data are given by the dashed curve, whereas the coherently demodulated data (with Matlab) are given by the continuous curve. The time $t_{PLL,ON} = -65 \text{ ns}$, at which the PLL is switched on, has been deliberately set to a value that shows the effect of excessive jitter during start-up on the demodulated signal.
Figure 4.20. Measured LO leakage at a center frequency $f_c = 3.3$ GHz with enabled (dotted line) and disabled (continuous line) output stage.

4.7. Performance Comparison with Other Tx

This section reviews recently published state-of-the-art IR-UWB transmitters and compares them with the proposed solution in Table 4.1. A key figure of merit in today’s IR-UWB transceivers is the energy consumption per bit (J/b). Comparing energy consumption requires some care; many publications show outstanding performance for transmitters running at pulse rates above 1 GHz. This naturally reduces the energy per transmitted bit drastically, especially for devices that do not care about spectrum shape and/or center frequency. Moreover, the overall link complexity is shifted to the associated receiver and is not reflected in this figure of merit. There are many transmitter solutions in use today, each addressing a particular application, such as data transmission, ranging, localization, radar, or a combination thereof. For a fair comparison, we focus on low to moderate data rate solutions enabling WPAN transmission and ranging, typically from 10 kb/s to 10 Mb/s. This kind of IR-UWB transmitter can be classified into two main families.

4.7.1. Oscillator-based

This family involves generating a pulse at baseband, filtering it to obtain the desired spectrum and upconverting it to a center frequency. Such transmitters usually employ OOK, PPM or BPSK. Phase modulation can be achieved either at baseband before upconversion or directly on the RF carrier. Classical RF blocks such as VCOs and mixers are usually employed:

- Reference [91] shows one of the best performances in terms of power consumption in this family, while still providing phase coherence during a limited time. This time-limited coherence is provided by a phase-aligned frequency-locked loop (PA-FLL) synthesizer. The accumulated jitter remains below 6 ps RMS during 30 ns and accommodates burst emission of BPSK signals for the IEEE 802.15.4a IR-UWB standard. At 1 Mburst/s (1 burst = 1 bit), the power consumption drawn from a 1 V supply ranges from 0.65 mW at 3.1 GHz to 1.4 mW at 10 GHz, leading to an energy consumption of 0.65 to 1.4 nJ/bit (with time-limited coherent BPM-BPSK modulation).

- The work of [90] is also intended for IEEE 802.15.4a BPM-BPSK modulation. The transmitter section supports 12 channels and is based on an active Gilbert-cell mixer for upconversion. Modulation is achieved prior to upconversion. A free-running quadrature VCO is tuned to the desired channel by a frequency-locked loop (FLL), which is then disabled to reduce power consumption. The accuracy is in the order of 10 MHz. The TX draws a current of 8 mA when generating pulses at a PRR of 500 MHz. At 5% duty-cycle, 1 burst/bit and under a supply voltage of 1.8 V, the energy consumption reaches 0.72 nJ/bit.

- Another interesting approach uses wideband analog FM [99,100] and targets low data rates of up to 125 kb/s in the newly available upper ECC band (see Section 2.3.5). The transmitter generates a very clean “brick-wall” spectrum between 7.2 and 7.7 GHz owing to the very wideband FM modulation and the triangular-shaped modulation signal. Similarly to the solution proposed in [92], the carrier generator (VCO) is used in open-loop mode and the frequency is calibrated periodically by means of a PLL and ADC/DAC devices. The power consumption is estimated at 5 mW for a maximum data rate of 125 kb/s (40 nJ/bit).

- An interesting advance in terms of power consumption has been presented by Mercier [101] from MIT in 2009. This work presents a low data rate (< 15.6 Mb/s) transmitter with a very low energy consumption in the order of 0.28 nJ/bit. The transmitter, realized in a 90 nm CMOS technology, generates non-coherent pulses by means of an inverter-based single-ended DCO, whose output is shaped by an innovative digital power amplifier. The latter consists of two capacitively coupled, out-of-phase single-ended PAs, which remove the low-frequency spectral content (common-mode component) of the classical single-ended PA approach.

Other implementations make use of gated LC oscillators, as published within the framework of this project [86] or in [102–104]. They generally lack flexibility in terms of spectrum shaping, modulation (PPM or OOK only) and frequency accuracy. The best reported FCC-compliant transmitter has been demonstrated by [103] and achieves an outstanding energy dissipation of 8 pJ/p at a 30 MHz PRR.

4.7.2. Fully-Digital Solutions

With the downscaling of transistor size below 100 nm, a new class of fully-digital transmitters appeared in 2007. Transit frequencies $f_T$ as high as 150 GHz enable the use of fully-digital CMOS topologies to synthesize bandpass UWB pulses. Transmitters of this kind are also known as carrier-less UWB synthesizers and generate pulses having the required bandpass spectral properties without any frequency translation. The clear advantage of these topologies is the zero static current between pulses. Recent work shows outstanding results in terms of power consumption, down to tens of pJ/pulse, i.e. one order of magnitude below oscillator-based solutions. The most remarkable all-digital solution has been proposed by Wentzloff in early 2007 [105]. The circuit combines a series of equally delayed edges to form a single RF pulse directly in the band of interest. The combined edges are then either filtered on-chip [106] or buffered through a series of digital inverters to drive the antenna. This kind of transmitter, however, needs at least a high-pass filter to remove the low-frequency components induced by single-ended CMOS topologies. Moreover, such transmitters suffer from ill-defined spectral characteristics, such as excessive side lobes (estimated spectral roll-off of 13 dB/GHz) and excessive spectrum overlap, which make them inappropriate for the ECC regulation. Another shortcoming is that, when the transmitter section is brought into a fully integrated transceiver, an LO must still be added for the receiver section to downconvert the bandpass RF pulses. This technique also suffers from an excessive sensitivity to process and temperature variations. Other similar and significant approaches can be found in [107–110].

From Table 4.1, we observe that fully-digital solutions outperform analog and mixed-mode topologies in terms of energy consumption per bit. The absence of static power consumption and the use of aggressive duty-cycle schemes are the main reasons for this performance. On the other hand, these solutions seem to show less frequency accuracy and especially worse spectral characteristics compared to analog and mixed-mode solutions. In multi-channel topologies, the spectral roll-off is of concern. This property depends on the transmitted signal duration, which must include ringings to reduce the sidelobes. This comes at the cost of an increased duty-cycle and power consumption. Most of the published work (except the UWB-FM solution [99]) shows a spectral roll-off in the order of 25 to 30 dB/GHz; the proposed transmitter exhibits a roll-off in excess of 70 dB/GHz.

4.8. Summary and Conclusions

The realized circuit is shown in Fig. 4.21. It has been fully integrated in a 0.18-µm CMOS technology utilizing a die area of 2 mm$^2$. The choice of the technology was mainly dictated by its availability and cost at the start of the realization of the transmitter IC. One of the main targets in this project was to use a commercially available technology. Moreover, shifting towards a smaller technology during the development would have forced a redesign of some of the blocks (PLL). A second reason was the reliability of the transistor models that come with a commercially available technology. In the realization of a complex system, the reliability of the design kit is much more important than for the design of simple circuits. The prototype chip features a PLL, a shaping $g_m$-C filter requiring a single external reference resistor for corner frequency setting, an up-conversion circuit with phase modulation capability and an output stage.

Table 4.2 summarizes the estimated current consumption for both measured configurations (BFSK and PMCW) described in Section 4.6. The detailed breakdown for the baseband is based on estimates from simulations. The first measured configuration corresponds to a 5 Mp/s LRPM mode in channel C (BFSK modulation between 4.5 GHz and 4.8 GHz), which corresponds to the worst case (highest VCO frequencies). As seen in the table, more than two-thirds of the current is used for the carrier generation by the PLL. The latter device, and especially means to reduce its power consumption for IR-UWB applications, are further investigated in Chapter 5 and in Appendix D, respectively. The measured overall current is 10% higher than the simulated one; the difference comes from slightly higher bias currents required by the PLL (mostly CML), which have been set to increase the overall performance during the test procedures. The second column details the power consumption of the burst PMCW mode presented in Section 4.6.3.

The measured power consumption given in Table 4.2 does not take into account the potential improvement of the VCO calibration method proposed in [92]. This approach has not been implemented in the transmitter, but has been evaluated on the receiver side in Section 7.7. Using this approach and neglecting the calibration time, the average power consumption of the VCO can be reduced to 1.6 mW (16% duty-cycle). For LRPM, this reduces the transmitter energy consumption to 3 nJ/p. Applying the same duty-cycle scheme to the pulse shaping filter section would reduce the overall power consumption down to 6 mW (or 1.2 nJ/p at 5 Mp/s). Simulations show slightly higher values due to the replica circuits that have to be continuously biased to enable fast common-mode settling.

Figure 4.21. Chip micrograph of the IR-UWB transmitter.

The circuit demonstrates an excellent flexibility to accommodate many signalling schemes. It operates within the 3-5 GHz UWB band and can generate very clean and sufficiently accurate IR-UWB pulses with a frequency resolution of 150 MHz ($f_{ref}/2$), free from any spurious signal.
Table 4.1. State-of-the-art low data rate transmitters comparison†

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology/supply</td>
<td>analog CML</td>
<td>mixed CMOS/CML</td>
<td>analog CML</td>
<td>analog ext. loop filter</td>
<td>digital CMOS</td>
<td>digital CMOS</td>
</tr>
<tr>
<td>Architecture</td>
<td>fully integrated</td>
<td>fully integrated</td>
<td>fully integrated</td>
<td>130 nm/1.1V</td>
<td>fully integrated</td>
<td>ext. filter</td>
</tr>
<tr>
<td></td>
<td>hybrid</td>
<td>direct mod.</td>
<td>180 nm/1.8V</td>
<td>direct mod.</td>
<td>direct mod.</td>
<td>edge-comb.</td>
</tr>
<tr>
<td>Modulation</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>- phase</td>
<td>⊗ → BPSK</td>
<td>⊗ → BPSK</td>
<td>BPSK → ⊗</td>
<td>-</td>
<td>⊗ → BPSK</td>
<td>delay-based</td>
</tr>
<tr>
<td>- frequency</td>
<td>≥1 GHz</td>
<td>≥500 MHz</td>
<td>≥500 MHz</td>
<td>-</td>
<td>-</td>
<td>dithering</td>
</tr>
<tr>
<td>- amplitude</td>
<td>BFSK ⊗</td>
<td>-</td>
<td>-</td>
<td>FM ⊗</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>- position</td>
<td>5th-o. Cheb.</td>
<td>⊗ → DAM</td>
<td>AAM → ⊗</td>
<td>BPSK sub-carr.</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>4-level</td>
<td></td>
<td>1st-o.+biquad</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Carrier</td>
<td>Ring-VCO</td>
<td>19 bits ring-DCO</td>
<td>Ring-VCO</td>
<td>LC-VCO</td>
<td>ring-DCO</td>
<td>edge delay</td>
</tr>
<tr>
<td>Calibration</td>
<td>PLL (100ns)</td>
<td>FA-DLL</td>
<td>D-PLL</td>
<td>PLL+AD/DA</td>
<td>count./FLL</td>
<td>count. (62 µs)</td>
</tr>
<tr>
<td>Freq. range</td>
<td>3.3-4.8 GHz</td>
<td>3.1-10.6 GHz</td>
<td>3-9 GHz</td>
<td>7.2-7.7 GHz</td>
<td>3-5 GHz</td>
<td>3.2-4.9 GHz</td>
</tr>
<tr>
<td>Freq. resol.</td>
<td>150 MHz</td>
<td>4 MHz (13 bits)</td>
<td>10 MHz</td>
<td>64 MHz (PLL)</td>
<td>10 MHz</td>
<td>± 25 MHz</td>
</tr>
<tr>
<td>PA</td>
<td>class-AB</td>
<td>n/a</td>
<td>class-A</td>
<td>n/a</td>
<td>inverter</td>
<td>inverter</td>
</tr>
<tr>
<td></td>
<td>diff/dyn.bias</td>
<td>n/a</td>
<td>diff/switched</td>
<td>n/a</td>
<td>single-ended</td>
<td>single-ended</td>
</tr>
<tr>
<td>Power cons., peak</td>
<td>53.6 mW</td>
<td>15.1 mW (16%)</td>
<td>n/a</td>
<td>5 mW (est.)</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>@ duty-cycle</td>
<td>10.7 nJ/b (16%)</td>
<td>14.4 mW (5%)</td>
<td>n/a</td>
<td>40 nJ/b</td>
<td>0.28 nJ/burst</td>
<td>47 pJ/p</td>
</tr>
<tr>
<td>E/bit (100% duty-cycle)</td>
<td>3 nJ/b (16%)</td>
<td>0.65 nJ/b (3%)</td>
<td>n/a</td>
<td>0.72 nJ/b (5%)</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>@ data rate</td>
<td>5 Mb/s</td>
<td>1 Mb/s</td>
<td>1 Mb/s</td>
<td>125 kb/s</td>
<td>15.6 Mb/s</td>
<td>10 Mb/s</td>
</tr>
<tr>
<td>Compliance</td>
<td>FCC/ECC</td>
<td>FCC</td>
<td>FCC</td>
<td>FCC/ECC</td>
<td>FCC</td>
<td>FCC</td>
</tr>
<tr>
<td>Spectral roll-off</td>
<td>75 dB/GHz</td>
<td>30 dB/GHz</td>
<td>26 dB/GHz</td>
<td>&gt;500 dB/GHz</td>
<td>40 dB/GHz</td>
<td>13 dB/GHz</td>
</tr>
</tbody>
</table>

Legend: → ⊗: pre-carrier (up-conversion), ⊗ →: post-carrier modulation, ⊗: direct carrier modulation.

† Note that, for energy consumption per bit (b), the definition of a “bit” is either equivalent to a “pulse” for a Tx generating a train of isolated pulses, or a “burst of pulses” for solutions using the IEEE802.15.4a standard, whichever applies. Both definitions actually consider the definition of a “bit” with respect to a non-coherent demodulation.

Table 4.2. TX power consumption overview for $V_{supply} = 1.8$ V (unused blocks are not taken into account in the power budget).

<table>
<thead>
<tr>
<th></th>
<th>LRPM</th>
<th>PMCW</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Ch.C</td>
<td>$f_o = 4.5$ GHz</td>
</tr>
<tr>
<td></td>
<td>5 Mp/s</td>
<td>1 Mburst/s, 32 ns</td>
</tr>
<tr>
<td>Total TX</td>
<td>29.8 mA ($\approx 10.7$ nJ/b)</td>
<td>5.35 mA ($\approx 9.7$ nJ/b)</td>
</tr>
<tr>
<td>(measured, w/ PLL)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>PLL</td>
<td>23 mA (100%$^\dagger$)</td>
<td>2.8 mA (12%)</td>
</tr>
<tr>
<td>Baseband</td>
<td>5.8 mA</td>
<td>2.2 mA</td>
</tr>
<tr>
<td>- input stage</td>
<td>1.7 mA</td>
<td>-</td>
</tr>
<tr>
<td>- shaping filt.</td>
<td>1.9 mA</td>
<td>-</td>
</tr>
<tr>
<td>- buffers</td>
<td>1.5 mA</td>
<td>1.5 mA</td>
</tr>
<tr>
<td>- bias+other</td>
<td>0.7 mA</td>
<td>0.7 mA</td>
</tr>
<tr>
<td>Modulator &amp;</td>
<td>1 mA (10 %)</td>
<td>0.35 mA (3.2%)</td>
</tr>
<tr>
<td>output amp.</td>
<td>($\approx 17$ mA-pk)</td>
<td>($\approx 17$ mA-pk)</td>
</tr>
</tbody>
</table>

$^\dagger$Numbers in parentheses are duty-cycle values.
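As a cross-check of Table 4.2, the block-level currents can be summed and converted into an energy per pulse or burst. This is a minimal sketch that only re-uses the tabulated values; the dictionary layout and function names are assumptions made for illustration.

```python
# Sketch: consistency check of the current budget in Table 4.2.
V_SUPPLY = 1.8  # V, supply voltage stated in the table caption

# Average block currents in mA, copied from Table 4.2
BUDGET_MA = {
    "LRPM": {"PLL": 23.0, "Baseband": 5.8, "Modulator & output amp.": 1.0},
    "PMCW": {"PLL": 2.8, "Baseband": 2.2, "Modulator & output amp.": 0.35},
}
EVENT_RATE_HZ = {"LRPM": 5e6, "PMCW": 1e6}  # pulses/s and bursts/s

def total_current_ma(mode):
    """Sum of the block currents for one operating mode."""
    return sum(BUDGET_MA[mode].values())

def pll_share(mode):
    """Fraction of the total current drawn by the PLL."""
    return BUDGET_MA[mode]["PLL"] / total_current_ma(mode)

def energy_per_event_nj(mode):
    """Supply power divided by the pulse (or burst) rate, in nJ."""
    power_w = total_current_ma(mode) * 1e-3 * V_SUPPLY
    return power_w / EVENT_RATE_HZ[mode] * 1e9

for mode in BUDGET_MA:
    print(f"{mode}: {total_current_ma(mode):.2f} mA, "
          f"PLL share {pll_share(mode):.0%}, "
          f"{energy_per_event_nj(mode):.1f} nJ per event")
```

The LRPM column sums to 29.8 mA and about 10.7 nJ per pulse, matching the table. The PMCW column sums to 5.35 mA, which evaluates to about 9.6 nJ per burst, close to the tabulated value of approximately 9.7 nJ.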
5. The PLL

5.1. Introduction

At the beginning of the “commercial” UWB era in the United States in 2002, it was often stated that UWB would not require any of the classical narrowband RF blocks such as phase-locked loops (PLLs) and up- and down-conversion mixers [111]. This statement is partially true, at least for systems using impulse-based transmission schemes relying on direct baseband signal synthesis and decoding.

As estimated in Fig. 3.31, the frequency accuracy specification for the carrier generation device can be set on the order of $\pm 15$ MHz for both the transmitter and the receiver, which corresponds to $\pm 3750$ ppm or $\pm 0.375\%$ at the mid-band frequency of 4 GHz. This tolerance is far looser than the accuracy achieved by classical PLLs clocked by a quartz crystal (typ. $\pm 100$ ppm), and one may ask whether such a solution is overdone. However, in order to keep the maximum accuracy for test purposes in both the transmitter and the receiver, the carrier frequency generation device will be based on a PLL. First, we will investigate the requirements for such a solution to enable a 10 Mpulses/s BFSK modulation. Subsequently, we will push the idea of “ill-defined” frequencies for IR-UWB; in Appendix D, we propose a sufficiently accurate frequency generation technique based on a free-running oscillator calibrated by the analog-to-digital conversion of the VCO control voltage of the locked PLL. This technique may be used for either the transmitter or the receiver (see Section 7.7) and allows for a significant reduction in power consumption.

In this thesis, an RF PLL has been designed such that the nine carriers involved in the channels A, B and C (each comprising two carriers for BFSK modulation at the transmitter and one for the receiver’s LO) can be generated with the same oscillator and loop filter and without any calibration. This, for instance, greatly simplifies the replication of the carrier generation device for, e.g., simultaneous multi-channel operation.

The first open issue was the design of a controlled oscillator with a sufficient tuning range to cover the three bands between 3.3 and 4.8 GHz and with sufficient margin to compensate for temperature and process variations. The wide tuning-range oscillator is described in Section 5.3. The second issue is the elaboration of a divider device able to provide at least the nine division ratios for Tx and Rx. This circuit is described in Section 5.4. In the following Section 5.2, we will first analyze the PLL requirements to enable the BFSK modulation at a rate of at least 10 MHz. A summary of the PLL specifications for both Tx and Rx modes is given in Table 5.1.

**Table 5.1. Summary of the PLL specifications**

<table>
<thead>
<tr>
<th></th>
<th>Tx</th>
<th>Rx</th>
</tr>
</thead>
<tbody>
<tr>
<td>Carrier frequencies</td>
<td>Ch. A: 3.3/3.6 GHz</td>
<td>3.45 GHz</td>
</tr>
<tr>
<td></td>
<td>Ch. B: 3.9/4.2 GHz</td>
<td>4.05 GHz</td>
</tr>
<tr>
<td></td>
<td>Ch. C: 4.5/4.8 GHz</td>
<td>4.65 GHz</td>
</tr>
<tr>
<td>Freq. switching rate</td>
<td>10 MHz (BFSK, $f_0 = 300$ MHz)</td>
<td>-</td>
</tr>
<tr>
<td>Freq. accuracy</td>
<td>±15 MHz (±5% of $f_0$)</td>
<td>±15 MHz</td>
</tr>
<tr>
<td>Phase Noise</td>
<td>-105 dBc/Hz at 150 MHz offset</td>
<td>-105 dBc/Hz</td>
</tr>
</tbody>
</table>
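The nine carriers of Table 5.1 all lie on a 150 MHz grid, so the required division ratios can be listed with a short script. The sketch below assumes an effective frequency step of 150 MHz (the transmitter frequency resolution quoted in Chapter 4) and an integer relation between carrier and step; the names `division_ratio` and `tolerance_ppm` are hypothetical helpers, not part of the design flow.

```python
# Sketch: division ratios for the nine carriers of Table 5.1.
# Assumption: carrier = N * 150 MHz (effective frequency step of the PLL).
F_STEP_MHZ = 150

# Carriers per channel in MHz: (BFSK low, Rx LO, BFSK high)
CARRIERS_MHZ = {
    "A": (3300, 3450, 3600),
    "B": (3900, 4050, 4200),
    "C": (4500, 4650, 4800),
}

def division_ratio(f_carrier_mhz, f_step_mhz=F_STEP_MHZ):
    """Integer ratio N such that f_carrier = N * f_step."""
    n, remainder = divmod(f_carrier_mhz, f_step_mhz)
    if remainder:
        raise ValueError(f"{f_carrier_mhz} MHz is not on the {f_step_mhz} MHz grid")
    return n

def tolerance_ppm(delta_f_hz, f_carrier_hz):
    """Relative frequency tolerance in parts per million."""
    return delta_f_hz / f_carrier_hz * 1e6

ratios = {ch: [division_ratio(f) for f in fs] for ch, fs in CARRIERS_MHZ.items()}
for ch, ns in ratios.items():
    print(f"channel {ch}: N = {ns}")
print(f"+/-15 MHz at 4 GHz = +/-{tolerance_ppm(15e6, 4e9):.0f} ppm")
```

Under this assumption the divider must provide the ratios 22 to 24, 26 to 28 and 30 to 32, and the $\pm 15$ MHz specification corresponds to $\pm 3750$ ppm at 4 GHz.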

5.2. PLL Dimensioning

5.2.1. PLL Architecture

A second-order PLL is required for our application. Such a device features a theoretically infinite capture range, which is highly desirable when using a VCO with a wide tuning range. Figure 5.1 depicts the simplified linear model of the PLL. It consists of a phase-frequency detector with a charge-pump device (PFD+CP), a loop filter (LF), a controlled oscillator (VCO) driven by a control voltage $V_c$ and a programmable frequency divider by $N$ (DIV).

**Open-loop Equations**

The dynamic behavior is analyzed in the next sections with the help of Bode plots. Before evaluating the open-loop transfer function of the PLL, the impedance of the loop filter (dashed box of Fig. 5.1) is calculated. We obtain

$$Z_{LF}(s) = R_z + \frac{1}{sC_z} = \frac{1 + sR_zC_z}{sC_z} = R_z \frac{s + \omega_z}{s}, \quad (5.1)$$

where

$$\omega_z = \frac{1}{R_zC_z} \quad (5.2)$$

is the zero of the transfer function and is used to ensure the stability of the loop. The PLL open-loop transfer function is, according to Fig. 5.1,

$$G_{ol}(s) = \frac{\phi_{div}(s)}{\phi_{ref}(s)} = K_{pfd} \cdot Z_{LF}(s) \cdot \frac{K_{vco}}{s} \cdot \frac{1}{N} = K_{pll} \frac{s + \omega_z}{s^2}, \quad (5.3)$$

where $K_{pfd}$ is the gain of the phase-frequency detector (with the charge-pump), which is, for a true phase-frequency detector, $I_{cp}/(2\pi)$ in [A/rad], with $I_{cp}$ the charge-pump current. $K_{vco}$ is the VCO conversion gain in [rad/s/V] and $N$ is the frequency division ratio. Note that in this simplified first-pass analysis, the reaction time of the VCO is assumed to be negligible. Simulations of the time constant associated with the VCO will confirm this hypothesis. $K_{pll}$, the PLL gain, reads

$$K_{pll} = \frac{K_{pfd} K_{vco} R_z}{N}. \quad (5.4)$$

We define the crossover frequency $\omega_c$, i.e. the frequency at which the open-loop gain is unity, which is solved from $|G_{ol}(j\omega_c)| = 1$, as

$$\omega_c = \sqrt{\frac{K_{pll}^2 + K_{pll} \sqrt{K_{pll}^2 + 4 \omega_z^2}}{2}}. \quad (5.5)$$

We will see in the next section how the crossover frequency is related to the bandwidth of the PLL. This relationship has been used during the design process of the PLL; it allows each block to be designed independently (open-loop), while giving indications on the dynamic behavior in closed-loop mode.

### Closed-loop Equations

The second-order closed-loop PLL equation is

$$G_{cl}(s) = \frac{\theta_{out}(s)}{\theta_{ref}(s)} = \frac{\omega_{out}(s)}{\omega_{ref}(s)} = \frac{G_{ol}(s)}{1 + G_{ol}(s)} = \frac{K_{pll}(s + \omega_z)}{s^2 + K_{pll}s + K_{pll}\omega_z}. \quad (5.6)$$

This equation can be expressed in terms of damping factor $\zeta$ and natural frequency $\omega_n$ as

$$G_{cl}(s) = \frac{2\zeta \omega_n s + \omega_n^2}{s^2 + 2\zeta \omega_n s + \omega_n^2}, \quad (5.7)$$

where

$$\omega_n = \sqrt{K_{pll}\omega_z} \quad (5.8)$$

and

$$\zeta = \frac{1}{2}\sqrt{\frac{K_{pll}}{\omega_z}} = \frac{R_z}{2} \sqrt{\frac{K_{pfd} K_{vco} C_z}{N}}. \quad (5.9)$$
The closed-loop gain has a low-pass characteristic. The -3 dB bandwidth of the PLL $\omega_{\text{pll}}$ is expressed as

$$\omega_{\text{pll}} = \omega_n \cdot \sqrt{1 + 2\zeta^2 + \sqrt{(2\zeta^2 + 1)^2 + 1}}. \quad (5.10)$$

For illustration purposes, Fig. 5.2 depicts the relationship between the crossover frequency $\omega_c$ and the PLL bandwidth $\omega_{\text{pll}}$ with respect to $\zeta$. The graph seems to show that an underdamped closed-loop system provides a favorable bandwidth extension for the fast frequency modulation. However, this bandwidth extension is accompanied by an overshoot in the transient response, which is not desired for our application. An excessive frequency overshoot may shift the instantaneous frequency during a pulse transmission and make the spectrum of the composite IR-UWB signal broader or narrower than expected. This aspect is investigated in detail in the next Section 5.2.2.
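The relations of this section can be verified numerically. The following sketch evaluates Eqs. 5.5, 5.8, 5.9 and 5.10 for a given pair $(K_{pll}, \omega_z)$ and checks that the open-loop gain is unity at $\omega_c$ and that the closed-loop gain has dropped by 3 dB at $\omega_{pll}$. The numerical values ($\zeta = 1$, $f_n = 12$ MHz) are illustrative assumptions and not the final design values.

```python
import math

def loop_parameters(k_pll, omega_z):
    """Return (omega_c, omega_n, zeta, omega_pll) from Eqs. 5.5, 5.8, 5.9, 5.10."""
    omega_c = math.sqrt(
        (k_pll**2 + k_pll * math.sqrt(k_pll**2 + 4 * omega_z**2)) / 2)
    omega_n = math.sqrt(k_pll * omega_z)
    zeta = 0.5 * math.sqrt(k_pll / omega_z)
    a = 2 * zeta**2 + 1
    omega_pll = omega_n * math.sqrt(a + math.sqrt(a**2 + 1))
    return omega_c, omega_n, zeta, omega_pll

def open_loop_gain(omega, k_pll, omega_z):
    """G_ol(j*omega) of Eq. 5.3."""
    s = 1j * omega
    return k_pll * (s + omega_z) / s**2

def closed_loop_gain(omega, k_pll, omega_z):
    """G_cl(j*omega) = G_ol / (1 + G_ol), Eq. 5.6."""
    g = open_loop_gain(omega, k_pll, omega_z)
    return g / (1 + g)

# Illustrative values only: zeta = 1 and f_n = 12 MHz
omega_n_target = 2 * math.pi * 12e6
k_pll = 2 * 1.0 * omega_n_target          # K_pll = 2 * zeta * omega_n
omega_z = omega_n_target**2 / k_pll       # omega_z = omega_n^2 / K_pll

omega_c, omega_n, zeta, omega_pll = loop_parameters(k_pll, omega_z)
print(f"zeta = {zeta:.2f}, f_n = {omega_n / (2 * math.pi) / 1e6:.1f} MHz")
print(f"f_c = {omega_c / (2 * math.pi) / 1e6:.1f} MHz, "
      f"f_pll = {omega_pll / (2 * math.pi) / 1e6:.1f} MHz")
print(f"|G_ol(j w_c)|   = {abs(open_loop_gain(omega_c, k_pll, omega_z)):.4f}")
print(f"|G_cl(j w_pll)| = {abs(closed_loop_gain(omega_pll, k_pll, omega_z)):.4f}")
```

The two printed magnitudes should come out as 1 and $1/\sqrt{2}$, respectively, which confirms that Eqs. 5.5 and 5.10 are consistent with the transfer functions of Eqs. 5.3 and 5.6.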
5.2.2. Settling Time for Direct Modulation

In order to enable direct 10 Mp/s BFSK modulation with a single PLL synthesizer, the settling time after each frequency change must theoretically be lower than 80 ns. As illustrated in Fig. 4.16, an overall time of approximately 20 ns is needed during the pulse emission; this time includes the ringings caused by the shaping filter and some margin, which leaves 80 ns of the 100 ns pulse period for settling. Thus, the error of the instantaneous frequency after a settling time of 80 ns must be smaller than $\pm 15$ MHz. This corresponds to 5% of the BFSK modulation offset $f_0$ and has been fixed according to the tolerated BER degradation caused by a potential center frequency offset between Tx and Rx (see Fig. 3.31). We assume that this frequency difference is equally shared between the receiver and the transmitter. From the linear model, we find that the phase error function $\phi_e(s)$ of a second-order PLL is given by

$$\phi_e(s) = \frac{s \cdot \phi_{\text{ref}}}{s + K_{\text{vco}} \cdot K_{\text{pfd}} \cdot Z_{\text{LF}}(s)/N}. \quad (5.11)$$

By using the relationship between the phase and the frequency, the frequency error $\omega_e(s)$ is

$$\begin{aligned}
\omega_e(s) = s \cdot \phi_e(s) &= \frac{s^2 \cdot \phi_{\text{ref}}}{s + K_{\text{vco}} \cdot K_{\text{pfd}} \cdot Z_{\text{LF}}(s)/N} \\
&= \frac{s \cdot \omega_{\text{ref}}}{s + K_{\text{vco}} \cdot K_{\text{pfd}} \cdot Z_{\text{LF}}(s)/N},
\end{aligned} \quad (5.12)$$

where $\omega_{\text{ref}} = \frac{\Delta \omega}{s}$ is a step function, whose step amplitude $\Delta \omega$ corresponds to the frequency offset $2\pi f_o$, as defined in Fig. 4.3. Finally, by using the formalism of the previous section ($\omega_n$ and $\zeta$), the frequency error equals

$$\omega_e(s) = \frac{\Delta \omega}{s + K_{\text{vco}} \cdot K_{\text{pfd}} \cdot Z_{\text{LF}}(s)/N} = \frac{\Delta \omega \cdot s}{s^2 + 2\zeta \omega_n s + \omega_n^2}. \quad (5.13)$$

The temporal expression of the frequency error $\omega_e(t)$ is obtained by applying the inverse Laplace transform to Eq. 5.13. The error $\omega_e(s)$ is a rational function that can be decomposed into simpler functions by partial fraction decomposition. This decomposition leads to three cases, depending on the roots of the denominator of Eq. 5.13:

1. one double root, or $4\omega_n^2 (\zeta^2 - 1) = 0$ or $\zeta = 1$ (critically damped);
2. two real roots, or $4\omega_n^2(\zeta^2 - 1) > 0$ or $\zeta > 1$ (overdamped);
3. two complex conjugate roots, or $4\omega_n^2(\zeta^2 - 1) < 0$ or $\zeta < 1$ (underdamped).

The solutions given by the inverse Laplace transform of the frequency error function $\omega_e(s)$ are:

$$
\omega_e(t) = \begin{cases}
\Delta \omega \cdot e^{-\zeta \omega_n t} \cdot \left[ \cos \left(\omega_n t \sqrt{1 - \zeta^2}\right) - \frac{\zeta}{\sqrt{1-\zeta^2}} \sin \left(\omega_n t \sqrt{1 - \zeta^2}\right) \right] & \zeta < 1 \\
\Delta \omega \cdot e^{-\zeta \omega_n t} \cdot \left[ 1 - \omega_n t \right] & \zeta = 1 \\
\Delta \omega \cdot e^{-\zeta \omega_n t} \cdot \left[ \cosh \left(\omega_n t \sqrt{\zeta^2 - 1}\right) - \frac{\zeta}{\sqrt{\zeta^2 - 1}} \sinh \left(\omega_n t \sqrt{\zeta^2 - 1}\right) \right] & \zeta > 1
\end{cases} \quad (5.14)
$$

**Figure 5.3.** Frequency error function $\omega_e(t)$ for a natural frequency $f_n = 12$ MHz and different damping factors $\zeta = 0.4, 1, 1.6$ corresponding to underdamped, critically damped and overdamped loop, respectively. The input signal corresponds to an input frequency step of $\Delta f = 300$ MHz at $t = 0$. The underdamped case shows the strong degradation in the settling time due to a second overshoot around 80 ns.
For illustration purposes, the frequency error function resulting from a frequency step of 300 MHz is simulated for different damping factors $\zeta$ in Fig. 5.3. The natural frequency is $f_n = 12$ MHz. We observe that an excessively underdamped loop cannot meet the desired accuracy in less than 80 ns.
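The behavior described for Fig. 5.3 can be reproduced with a direct evaluation of the inverse Laplace transform of Eq. 5.13. In the sketch below, the critically damped branch is written as $1 - \omega_n t$ so that the error equals the full step at $t = 0$, like the other two branches. The grid-based `settling_time` helper and its parameters are assumptions chosen for illustration, not the simulation setup used for the figure.

```python
import math

F_N = 12e6                  # natural frequency from Fig. 5.3, Hz
OMEGA_N = 2 * math.pi * F_N
DELTA_F = 300e6             # frequency step, Hz
TOLERANCE = 15e6            # required accuracy, Hz

def freq_error(t, zeta, omega_n=OMEGA_N, delta=DELTA_F):
    """Frequency error after a step `delta` at t = 0 (inverse of Eq. 5.13)."""
    decay = math.exp(-zeta * omega_n * t)
    if zeta < 1:
        r = math.sqrt(1 - zeta**2)
        x = omega_n * t * r
        return delta * decay * (math.cos(x) - zeta / r * math.sin(x))
    if zeta == 1:
        return delta * decay * (1 - omega_n * t)
    r = math.sqrt(zeta**2 - 1)
    x = omega_n * t * r
    return delta * decay * (math.cosh(x) - zeta / r * math.sinh(x))

def settling_time(zeta, tol=TOLERANCE, t_max=400e-9, steps=40000):
    """Last grid instant at which |error| still exceeds `tol`."""
    dt = t_max / steps
    last = 0.0
    for k in range(steps + 1):
        t = k * dt
        if abs(freq_error(t, zeta)) > tol:
            last = t
    return last

for zeta in (0.4, 1.0, 1.6):
    err_80 = freq_error(80e-9, zeta) / 1e6
    print(f"zeta = {zeta}: error at 80 ns = {err_80:+.1f} MHz, "
          f"settling time = {settling_time(zeta) * 1e9:.0f} ns")
```

With these values, the loop with $\zeta = 0.4$ is still well outside the $\pm 15$ MHz window at 80 ns, whereas the critically damped and overdamped loops settle earlier, in agreement with the observation made for Fig. 5.3.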

### 5.2.3. Loop delay

The dynamic behavior of the PLL is also influenced by time delays in the loop. Loop delays are due to propagation times through the different devices and originate mainly in the divider and the oscillator. For the latter device, the response time also includes a low-pass behavior that can be considered as an additional delay. The effect of a delay on the lock-in time is investigated by means of simulations. Figure 5.4 shows the PLL lock-in times as a function of the normalized PLL bandwidth $f_{PLL}$ and the loop time delay $t_d$. The normalization parameter is the desired lock-in time $t_{lock-in,spec}$. As analyzed in the previous section (case without time delay, continuous curves), the lock-in time drastically increases for $\zeta < 0.5$ due to overshoots of the PLL transient.

The net effect of a time delay in the loop is a reduction of the phase margin, which prevents the attenuation of the overshoots in the time domain. We notice that for a time delay equivalent to 5% of the desired lock-in time (dash-dotted curve), the minimum damping factor $\zeta$ required to obtain a settling time meeting the specification is increased to 0.66 (vertical thin dashed line). For a lock-in time of 80 ns with an accuracy better than $\pm 15$ MHz (i.e. 5% accuracy), the minimum required PLL bandwidth is thus $f_{PLL} > 1.7/(80\ \text{ns}) \approx 21.3$ MHz.

Equivalently, direct modulation meeting the specification is obtained by choosing the damping ratio and the natural frequency as (Eq. 5.10)

\[
\zeta \geq 0.5, \quad (5.15)
\]

\[
\omega_n \geq 2\pi \cdot (12 \text{ MHz}) \approx 75.4 \cdot 10^6 \text{ rad/s.} \quad (5.16)
\]
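
The two numerical bounds above can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative, and the normalized bandwidth factor of 1.7 is taken as an input read from Fig. 5.4 rather than derived here.

```python
import math


def min_pll_bandwidth_hz(t_lock_s, normalized_bandwidth=1.7):
    """Minimum PLL bandwidth f_PLL > k / t_lock.

    normalized_bandwidth (k) is the factor read from Fig. 5.4 in the text;
    it is an assumption of this sketch, not a computed quantity.
    """
    return normalized_bandwidth / t_lock_s


def natural_frequency_rad_s(f_n_hz):
    """Convert a natural frequency in Hz to rad/s (Eq. 5.16)."""
    return 2.0 * math.pi * f_n_hz


f_pll_min = min_pll_bandwidth_hz(80e-9)      # lock-in time of 80 ns
omega_n_min = natural_frequency_rad_s(12e6)  # f_n = 12 MHz

print(f"f_PLL   > {f_pll_min / 1e6:.2f} MHz")
print(f"omega_n >= {omega_n_min / 1e6:.1f}e6 rad/s")
```

The first value evaluates to 21.25 MHz, which the text rounds to 21.3 MHz.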

### 5.2.4. Stability Criterion

An important stability criterion in a charge-pump PLL constrains the PLL’s reference frequency $f_{ref}$ and comes from the fact that the phase detector and the charge pump work in the discrete-time domain.

Figure 5.4. PLL bandwidth selection and influence of a time delay in the PLL.

The PLL operates as a sampled system and not as an ideal continuous-time circuit. The PLL may become unstable if the loop gain is made so large that its closed loop bandwidth becomes comparable to the reference frequency. Gardner’s stability criterion [112] states that there is an upper boundary for the loop gain that can be used with a given reference frequency $\omega_{\text{ref}}$:

$$\frac{K_{\text{pll}}}{\omega_z} < \frac{1}{\frac{\pi \omega_z}{\omega_{\text{ref}}} \left(1 + \frac{\pi \omega_z}{\omega_{\text{ref}}}\right)} = \frac{\omega_{\text{ref}}^2}{\pi \omega_z^2 (\pi + \omega_{\text{ref}}/\omega_z)}.$$  (5.17)

With the help of Eq. 5.8, the above equation can be expressed as

$$\omega_n^2 < \frac{\omega_{\text{ref}}^2}{\pi (\pi + \omega_{\text{ref}}/\omega_z)}.$$  (5.18)

By replacing $\omega_z$ by $\omega_n^2/K_{\text{pll}}$ in the previous equation, it is possible to obtain an equation that defines the loop gain $K_{\text{pll}}$ and the reference frequency $\omega_{\text{ref}}$ with respect to the natural frequency $\omega_n$:

$$\omega_n < \frac{1}{\pi} \sqrt{\omega_{\text{ref}} \cdot (\omega_{\text{ref}} - \pi K_{\text{pll}})}.$$  (5.19)

By plotting this condition in the \([f_{\text{ref}}, K_{\text{pll}}]\) design space, together with the conditions given in Eq. 5.15 and 5.16, we obtain the contours and areas of Fig. 5.5. This figure depicts how the reference frequency in a charge-pump PLL has to be chosen. The horizontal axis represents the reference frequency in MHz; since multiplication or division by two is assumed to be much easier to implement in a circuit, only multiples or \(2^{-n}\)-submultiples of the offset frequency \(f_o = 300\) MHz are shown as potential reference frequencies \(f_{\text{ref}}\) for the PLL (i.e. 600 MHz or 150, 75 and 37.5 MHz, respectively). The PLL open-loop gain \(K_{\text{pll}}\) is represented on the vertical axis.
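
As a rough numerical illustration of Eq. 5.19, the following Python sketch evaluates the upper bound on $\omega_n$ for the candidate reference frequencies. The function name is hypothetical, and $K_{\text{pll}}$ is assumed to be expressed in s$^{-1}$ so that $\pi K_{\text{pll}}$ can be compared directly with $\omega_{\text{ref}}$.

```python
import math

OMEGA_N_MIN = 2.0 * math.pi * 12e6  # required natural frequency, Eq. 5.16


def max_natural_frequency(f_ref_hz, k_pll):
    """Upper bound on omega_n (rad/s) according to Eq. 5.19.

    k_pll is assumed to be in 1/s. Returns 0.0 when the radicand is not
    positive, i.e. when no natural frequency satisfies the criterion.
    """
    omega_ref = 2.0 * math.pi * f_ref_hz
    radicand = omega_ref * (omega_ref - math.pi * k_pll)
    if radicand <= 0.0:
        return 0.0
    return math.sqrt(radicand) / math.pi


# Most optimistic case: vanishing loop gain, where the bound is omega_ref / pi.
for f_ref in (37.5e6, 75e6, 150e6, 300e6, 600e6):
    bound = max_natural_frequency(f_ref, k_pll=0.0)
    verdict = "ok" if bound > OMEGA_N_MIN else "too low"
    print(f"f_ref = {f_ref / 1e6:6.1f} MHz: omega_n < {bound / 1e6:7.1f}e6 rad/s ({verdict})")
```

In the limit of vanishing loop gain the bound reduces to $\omega_{\text{ref}}/\pi$, so under these assumptions a reference of 37.5 MHz cannot reach the required $\omega_n$ for any gain, whereas 75 MHz can.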

![Figure 5.5. Choice of a PLL reference frequency. For the required dynamic behavior, the reference frequency and the PLL gain can only be chosen within the allowed design space delimited by the white area.](image)

The dark gray shaded area on the left-hand side corresponds to Gardner’s criterion expressed with the natural frequency (Eq. 5.19). It states that, regardless of the damping factor, the minimum reference frequency is 75 MHz. The condition given by Fig. 5.4, however, limits the damping factor \(\zeta\) to a value larger than 0.5 (without delay in the loop). This defines an additional forbidden area (light gray), which brings the minimum reference frequency to 150 MHz. At this reference frequency, the open-loop gain of the PLL may vary by about 16 dB, from 150 to 166 dB. From Eq. 5.9, process variations of the filter components \( R_z \) and \( C_z \) potentially change the value of the damping factor \( \zeta \) by \( \pm 12\% \) and \( \pm 6\% \), respectively. The total variation may reach \( \pm 20\% \), which corresponds to \( \pm 2 \) dB. An additional 3 dB variation comes from the division ratio \( N \), which can be set between 11 and 16. This leaves less than 9 dB for the variation of \( K_{vco} \) and temperature effects. This margin is further reduced by any delay in the loop, which increases the equivalent damping factor as illustrated by the dashed line and arrow. Therefore, a reference frequency of at least 300 MHz has to be chosen for the PLL. This reference frequency allows sufficient margin for the design tolerances.
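
The gain budget described above can be checked with a short Python sketch. The variable names are illustrative; the 16 dB window and the $\pm 2$ dB filter spread are taken from the text as given, and only the contribution of the division ratio is computed.

```python
import math


def ratio_db(ratio):
    """Express a gain ratio in dB (20*log10, as used for loop gain)."""
    return 20.0 * math.log10(ratio)


gain_window_db = 166.0 - 150.0         # allowed K_pll range at f_ref = 150 MHz
filter_spread_db = 2 * 2.0             # +/-2 dB from R_z and C_z, as stated
divider_spread_db = ratio_db(16 / 11)  # division ratio N between 11 and 16

remaining_db = gain_window_db - filter_spread_db - divider_spread_db

print(f"divider spread: {divider_spread_db:.2f} dB")
print(f"remaining for K_vco and temperature: {remaining_db:.2f} dB")
```

With these inputs the divider contributes slightly more than 3 dB, so the remaining margin falls just below 9 dB, in line with the statement above.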

## 5.3. The Wideband Oscillator

### 5.3.1. Introduction

The topology used is a ring oscillator. Several considerations have dictated the choice of this topology. The main one is that LC-oscillators generally suffer from a reduced tuning range due to the small capacitance range of varactors. Using a standard SOI CMOS process, state-of-the-art single LC tunable oscillators reach slightly more than 50\% of tuning range [113, 114] with the help of band switching. Wider ranges can be obtained with multiple LC-oscillators, which increase the silicon area and the design complexity. A goal of this thesis was to investigate the possibility of covering the entire UWB band between 3 and 5 GHz, including process, (supply-) voltage and temperature variations (PVT), with a single oscillator. With these considerations, we will see that the tuning range of the previously cited LC-oscillators still stays below our requirements. Furthermore, amplitude stability over PVT is difficult to control in LC-oscillators, and quadrature signal generation for the receiver requires more complex topologies.

The second main concern in a wideband oscillator is the phase noise. LC-oscillators are primarily used for narrowband systems, in which the close-in phase noise of the PLL is one of the most important parameters. We have seen, in Section 3.10.4, that the requirement on the phase noise in the proposed carrier-based IR-UWB receiver is considerably relaxed. Flicker noise and close-in PLL phase noise will not affect the system performance owing to the large modulation index used by the proposed BFSK. Thus, only phase noise at large offset frequencies is critical and has to be considered.

### 5.3.2. Ring Oscillator

Steady oscillations in an inverter-based ring require a total phase shift around the loop of $360^\circ$ at the frequency of oscillation and a loop gain larger than 0 dB. In a high-frequency ring oscillator, both the phase and the gain transfer functions must be considered. The transfer function of the inverter at the frequency of oscillation determines the amount of phase shift introduced by each stage. A differential ring oscillator with four stages is particularly useful for quadrature signal generation as required by our receiver.

**Oscillation Frequency**

For a 4-stage ring circuit to oscillate and to provide quadrature outputs, each stage must contribute a phase shift of $45^\circ$. In order to reach a phase shift of $360^\circ$ around the loop, an additional phase inversion of $180^\circ$ is achieved by swapping the feedback lines of the differential architecture, as illustrated in Fig. 5.6. With the assumption that each capacitively loaded inverter can be modeled by a first-order low-pass function, the frequency at which a phase shift of $45^\circ$ occurs is the cut-off frequency, thus

$$f_{osc} = f_{-3\,dB} = \frac{1}{2\pi R_L C_L}. \quad (5.20)$$

Since the gain of the passive network at this frequency is $\frac{1}{\sqrt{2}}$, the minimum DC voltage gain of each stage is $A_{v,\,min} = \sqrt{2}$ (or +3 dB). At the frequency of oscillation $f_{osc}$, the modulus $|Z_L|$ of the load impedance seen by the transconductance of the differential pair is $R_L/\sqrt{2}$ (the capacitance $C_L$ has a purely imaginary impedance whose modulus is $R_L$). If the differential pair experiences complete current switching, the peak-to-peak amplitude of the voltage swing at one output $V_{o+}$, resp. $V_{o-}$, is

$$V_{pp} = I_b \cdot \frac{R_L}{\sqrt{2}}. \quad (5.21)$$

By replacing the expression of $R_L$ in Eq. 5.20 with the above equation, we obtain the expression defining our current-controlled ring-oscillator

$$f_{osc} = \frac{I_b}{2\pi \sqrt{2} \cdot C_L \cdot V_{pp}}. \quad (5.22)$$
The frequency of oscillations $f_{\text{osc}}$ shows a linear dependence on the bias current $I_b$. Furthermore, it can be made stable for a given bias current $I_b$, if the swing $V_{\text{pp}}$ can be made stable over supply and temperature variations.
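
A minimal Python sketch of Eq. 5.22 illustrates this linear dependence. The function name and the numerical values ($C_L = 50$ fF, $V_{\text{pp}} = 0.5$ V) are hypothetical and chosen only for illustration; they are not the design values of this work.

```python
import math


def ring_oscillator_frequency(i_bias_a, c_load_f, v_pp_v):
    """Oscillation frequency of the 4-stage ring according to Eq. 5.22."""
    return i_bias_a / (2.0 * math.pi * math.sqrt(2.0) * c_load_f * v_pp_v)


# Hypothetical operating point, for illustration only.
C_LOAD = 50e-15  # total load capacitance in farads (assumed)
V_PP = 0.5       # single-ended peak-to-peak swing in volts (assumed)

for i_bias_ma in (0.75, 1.0, 1.25):
    f_osc = ring_oscillator_frequency(i_bias_ma * 1e-3, C_LOAD, V_PP)
    print(f"I_b = {i_bias_ma:.2f} mA -> f_osc = {f_osc / 1e9:.2f} GHz")
```

Doubling the bias current at constant swing and load capacitance doubles the frequency, which is the property exploited for current-controlled tuning.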

**Voltage Swing $V_{\text{pp}}$**

If each stage undergoes a complete switching, the outputs on each side of the delay cell vary between $V_{dd}$ and $V_{dd} - V_{\text{pp}}$. We first determine the maximum voltage swing possible for such a topology such that the transistors of the differential pair ($M1, M2$) stay in saturation. We assume that the oscillations do not experience strong clipping, so that the harmonics generated by nonlinearities are small; this characteristic is highly desirable to avoid parasitic spurs at transmission or the down-conversion of noise around LO harmonics in the receiver chain. In a four-stage ring oscillator, when the output of one stage (drain) is at its minimum level $V_{\text{min}}$, the input level (gate) is at $V_{\text{min}} + V_{pp} \cdot \sin(\pi/4) = V_{\text{min}} + V_{pp}/\sqrt{2}$. Consequently, the gate-drain voltage difference $V_{pp}/\sqrt{2}$ at each transistor must not exceed one threshold voltage $V_{th,n}$ and the maximum voltage swing is

$$V_{pp} < \sqrt{2}V_{th,n}, \quad (5.23)$$

where $V_{th,n}$ is given by the technology parameters (here approximately 400 mV for a minimum-length NMOS transistor). This value increases slightly if we consider the body effect: assuming a minimum voltage of 350 mV for the bias current source to stay in the saturation region, the body effect raises the threshold voltage $V_{th,n}$ to about 450 mV. The maximum differential voltage swing for which the transistors stay in saturation is thus $V_{pp,\text{diff}} < 2\sqrt{2}V_{th,n} \approx 1.27 \text{ V}$, which is sufficient for our application.

### 5.3.3. Core Inverter Cell Optimization for Tuning Range

This section investigates the optimization of the inverter delay cell in terms of transistor sizing, operating frequency range and current consumption. As seen in Eq. 5.22, the voltage swing $V_{pp}$ must be kept stable to obtain an oscillation frequency that depends linearly on the bias current. This is achieved through a replica bias circuit and an OTA; the amplitude stabilization circuit is shown in Fig. 5.7. The load resistors are implemented with the PMOS transistors M3 and M4. The approach is to set the load resistance of the PMOS such that the oscillations have a constant swing. The load resistance is defined by the PMOS gate voltage $V_g$. A replica half-circuit with an active transistor Mr1, whose gate is connected to its drain, generates a replica of the common-mode voltage $V_{cm}$. This common-mode voltage is compared with an external reference voltage $V_{\text{ref}}$. The difference signal provided by the OTA is a current that charges the gate-source capacitance of the PMOS load to obtain the desired gate voltage $V_g$. The PMOS gate voltage $V_g$ must stay within a range that is limited

- on the lower side, by the minimum output voltage of the OTA driving the PFET gate, and
- on the upper side, by the operating condition of the PMOS, which must stay in the triode region when $V_g$ increases.

Consequently, we obtain

\[ V_{\text{out, min, OTA}} < V_g < V_{dd} - V_{pp} - V_{th,p} \] (5.24)

This range actually fixes the tuning range of our oscillator, which depends mainly on the supply voltage and the oscillation amplitude. In the next sections, we propose a graphical method to optimize the oscillator in terms of tuning range under different PVT operating conditions. More particularly, the proposed methodology will allow us to choose the correct transistor sizes and bias current \( I_b \). It will also enable us to answer a very fundamental question in wideband RF signal synthesis: how many oscillators are required to cover a given frequency range?

**Inverter Cell Equivalent Circuit**

We first define the equivalent PMOS resistance; the intrinsic channel resistance \( R_{\text{int}} \) is expressed as

\[ R_{\text{int}} = \left( \mu_{\text{eff,p}} C_{\text{ox}} \frac{W_p}{L_p} (V_{dd} - V_g - V_{th,p} - \frac{\Delta V_{\text{osc}}}{2}) \right)^{-1}, \] (5.25)

where \( \frac{\Delta V_{\text{osc}}}{2} \) is used here instead of \( V_{ds} \) to obtain an average value of the resistance, and \( V_g \) is the gate voltage referred to ground. The total resistance can be approximated with the help of the BSIM3 expression, which includes the extrinsic drain and source resistance \( R_{dsw,p} \) expressed in $\Omega \cdot \mu$m (and denoted $R_{\text{ext}}$ in the Design Manual [115]):

$$R_L = \frac{R_{\text{int}}}{1 - \frac{R_{dsw,p}}{W_p \cdot 10^6 \cdot R_{\text{int}}}}. \quad (5.26)$$
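
Equations 5.25 and 5.26 can be evaluated as in the following Python sketch. The function and parameter names are hypothetical, all numerical values are placeholders rather than technology data, and $V_{th,p}$ is assumed to be given as a positive magnitude.

```python
def intrinsic_resistance(mu_cox, w_over_l, v_dd, v_g, v_th_p, delta_v_osc):
    """Intrinsic PMOS channel resistance in ohms, Eq. 5.25.

    mu_cox is the product mu_eff,p * C_ox in A/V^2 and v_th_p is taken as a
    positive magnitude (both are assumptions of this sketch).
    """
    overdrive = v_dd - v_g - v_th_p - delta_v_osc / 2.0
    if overdrive <= 0.0:
        raise ValueError("PMOS load is off for this gate voltage")
    return 1.0 / (mu_cox * w_over_l * overdrive)


def load_resistance(r_int, r_dsw_ohm_um, w_p_m):
    """Total load resistance in ohms, Eq. 5.26.

    r_dsw_ohm_um is in ohm*um and w_p_m in meters, so that w_p_m * 1e6 is
    the transistor width in micrometers.
    """
    ratio = r_dsw_ohm_um / (w_p_m * 1e6 * r_int)
    if ratio >= 1.0:
        raise ValueError("extrinsic resistance exceeds the validity of Eq. 5.26")
    return r_int / (1.0 - ratio)


# Placeholder values for illustration only (not technology data).
r_int = intrinsic_resistance(mu_cox=60e-6, w_over_l=9.2 / 0.18,
                             v_dd=1.8, v_g=0.6, v_th_p=0.45, delta_v_osc=0.5)
r_load = load_resistance(r_int, r_dsw_ohm_um=500.0, w_p_m=9.2e-6)
print(f"R_int = {r_int:.0f} ohm, R_L = {r_load:.0f} ohm")
```

The sketch makes the tuning mechanism explicit: lowering $V_g$ increases the overdrive, which reduces $R_L$ and, through Eq. 5.20, raises the oscillation frequency.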

The total load capacitance $C_L$ is defined as the sum of the different capacitances seen at the output node. The NMOS gate-drain and gate-source capacitances are counted twice to take into account the effects of both the following stage and the output buffer. For the PMOS load, the gate-drain capacitance is considered because the gate is connected to an AC ground ($V_g$), and the bulk-drain capacitance loads the output node. Thus, the total load capacitance is

$$C_L = 2 \cdot (C_{gs,n} + C_{gd,n}) + C_{gd,p} + C_{bd,p} + C_{\text{par}}, \quad (5.27)$$

where $C_{\text{par}}$ is the parasitic capacitance associated with the metal wires interconnecting the different stages and buffers and is in the order of a few femtofarads. The different capacitances are modeled with the following equations:

$$C_{gs,n} = \frac{2}{3} C_{\text{ox}} W_n L_n + W_n C_{ov,n},$$

$$C_{gd,n} = W_n C_{ov,n} (A_v + 1),$$

$$C_{gd,p} = \frac{3}{4} A_{\text{bulk}} W_p L_p C_{\text{ox}} + W_p C_{ov,p},$$

$$C_{bd,p} = \frac{2}{3} L_{s,p} W_p C_{jbo,p} + \frac{2}{3} C_{jswo,p} (2L_{s,p} + W_p),$$

where gate-drain capacitances are Miller capacitances multiplied by the voltage gain $A_v$ of the loaded inverter cell.

The goal of the next sections is to find a way to optimally define the NFET and PFET transistor sizes of the inverter cell for a maximum tuning range under several conditions (supply voltage, temperature and process variations).

**Tuning Range**

The method used here is a graphical method based on the equations of the previous section. Figure 5.8-a) shows the frequencies corresponding to the minimum $V_g$ defined by Eq. 5.24 for different $W_p$ and $W_n$ at high temperature ($T=80^\circ \text{C}$) and slow process ($-1\sigma$). The dark gray area defines the design space where no oscillation occurs due to insufficient loop gain. The light gray area represents the design space having an oscillation frequency higher than the minimum requirement of 4.8 GHz. Therefore, the transistor widths $W_p$ and $W_n$ must be chosen within this area. As an example, for $W_p = 9\,\mu$m and $W_n = 12\,\mu$m, an oscillation frequency of 5.3 GHz is reached; this requires an inverter-cell bias current of 1.45 mA. We will see in Section 5.3.4 how the phase noise requirement further dictates the choice of transistor size and bias current. The same procedure is repeated at the opposite PVT extreme for the minimum frequency of 3.3 GHz. Figure 5.8-b) shows a light gray area representing the possible transistor sizes. The previously chosen values meet the requirement and the bias current is now 0.78 mA.

**How Many Oscillators are Needed?**

We are now able to answer the question about the number of oscillators needed to cover our frequency range under extreme PVT. With respect to the tuning range parameter, our application requires a single 4-stage ring oscillator. Note that the $1\sigma$ worst-case variation used in the graphical approach is defined as a set of positive or negative $1\sigma$ variations of the different process parameters, chosen with the sign leading to the worst-case scenario (similarly to the classical corner analysis in IC design). The process parameters considered here are gate length variation, mobility and oxide thickness. In a first approximation, process variations act essentially on the load capacitance, while temperature influences the technology transconductance, thus affecting the differential pair transconductance and the equivalent load resistance. A minimal supply voltage of 1.7 V has also been considered in the calculation to reduce the effective tuning range to its minimum.

Figure 5.8. Graphical method to optimize the value of the transistor size $W_p$ and $W_n$ under extreme conditions of PVT for a) $f_{\text{max}} = 4.8$ GHz and b) $f_{\text{min}} = 3.3$ GHz. The chosen value is located by a square marker and meets the requirement in terms of tuning range with additional margin.

### 5.3.4. Phase Noise

When using automatic level control to fix the amplitude of oscillations, the internal signal swing can be limited such that the transistor devices are hardly completely turned on and off. As a result, a classical noise model (also called linear time-invariant or LTI), such as proposed by Razavi [116, 117], gives a reasonably good prediction for phase noise. The typical accuracy is in the order of 1 dB for a 4-stage ring oscillator. It has also been demonstrated that LTI phase noise models slightly overestimate the noise when the oscillator enters a non-linear mode of operation. Thus, the LTI model represents an ideal tool for a first-pass analysis. According to Razavi, the total output noise voltage density in \( V^2/\text{Hz} \) at an offset \( \Delta \omega \) from the carrier frequency \( \omega_0 \), for a four-stage ring oscillator, is

\[
|V_{n,tot}(j(\omega_0 + \Delta \omega))|^2 \approx \frac{R_L^2}{4} \, \overline{I_n^2} \left( \frac{\omega_0}{\Delta \omega} \right)^2, \quad (5.28)
\]

where \( R_L \) is the equivalent resistance of the PMOS load seen at the output of the active stage (Eq. 5.26) and \( \overline{I_n^2} \) is the equivalent noise current density injected at each node of the ring oscillator by the active stage and the PMOS loads.

Figure 5.9 illustrates how the phase noise governs the choice of transistor widths and bias current for the 4-stage ring oscillator. Dotted contours represent the calculated phase noise over the entire design space according to Eq. 5.28. With the previous example (square marker, \( W_p = 9 \mu m \) and \( W_n = 12 \mu m \)), the calculated phase noise is around -110 dBc/Hz at a frequency offset of 150 MHz. A second example with halved transistor sizes is given on the same plot by a round marker. In this case, the bias current can be reduced but the phase noise requirement is no longer met.

### 5.3.5. Simulation Results of the Oscillating Core

Starting from the considerations above and after some minor optimization with Cadence, we simulate an oscillator core with the following parameters:

- NMOS transistor size: \( W_n = 5 \times 2.3 = 11.5 \mu m \), \( L = 180 nm \);
- PMOS transistor size: \( W_p = 4 \times 2.3 = 9.2 \mu m \), \( L = 180 nm \).

The simulation results for the frequency of oscillation and the VCO gain \( K_{\text{vco}} \) over the temperature range from \(-20^\circ C\) to \(+80^\circ C\) (industrial range) are shown in Fig. 5.10. Since the VCO is not equalized, the variation of the VCO gain over the frequency range between 3 and 5 GHz may affect the PLL step response. At ambient and high temperatures, \( K_{\text{vco}} \) varies between -1 and -3 GHz/V. This variation corresponds roughly to 10 dB in the loop gain and partly explains why a reference frequency of 300 MHz instead of 150 MHz has been chosen (Fig. 5.5). Note that, to stabilize the loop dynamics over the entire bandwidth, the loop gain can be equalized by changing the gain of the PFD \( K_{\text{pfd}} \). This method is easy to implement, since \( K_{\text{pfd}} \) can be slightly varied according to the division ratio to compensate for the non-flat \( K_{\text{vco}} \).

Figure 5.9. Graphical representation of the phase noise at an offset frequency $\Delta \omega / 2\pi = 150$ MHz in the $[W_p, W_n]$ design space for nominal process variation and high temperature. The markers correspond roughly to an oscillation frequency of $4$ GHz.

The influence of a time delay in the PLL loop has been investigated in Section 5.2.3. In order to estimate the amount of delay introduced by the VCO device, post-layout simulations have been performed and are given in Fig. 5.11. We observe that, for a frequency step comparable to the modulation offset $f_0$, the effective time delay is smaller than 2 ns; this corresponds to 2.5% of the targeted lock-in time. The effect of this time delay on the PLL behavior is depicted by the dashed curve in Fig. 5.4. Such a value affects the transient behavior of the PLL only slightly.

## 5.4. The Frequency Divider

### 5.4.1. Proposed Architecture

The proposed PLL will be employed for both the BFSK modulation (Tx) and the LO generation (Rx) and has to synthesize frequencies from 3.3 to 4.8 GHz in steps of 150 MHz. Assuming a reference frequency $f_{ref} = 300$ MHz, as stated in Fig. 5.5, the divider consequently has to provide division ratios from 11 to 16 in half-unit steps.

Figure 5.10. Post-layout simulations of the VCO oscillation frequency and gain $K_{vco}$ for different temperatures (oscillator core loaded with buffers). Measurements (dot markers) are derived from PLL measurements provided later in Section 5.5.1.

The block schematic of the entire frequency divider is depicted in Fig. 5.12. The division ratios are synthesized by the combination of a high-frequency tri-modulus prescaler (gray-shaded area) driven by a dual-modulus divider (rightmost block). The latter provides driving

The tri-modulus prescaler is based on a topology derived from the phase-rotator (or phase-switching) approach first proposed in [118]. In this work, this solution has been enhanced to provide three division ratios, namely $\div 3$, $\div 3.5$ and $\div 4$, instead of two. As detailed later, this improvement enables the $\div 3.5$ ratio required for half-unit steps.

The principle of the division operation is described hereafter by means of an example for the division ratio 11 ($f_{vco} = 3.3$ GHz). The dual-modulus divider is configured to provide a division-by-3 with an output signal ($DIV_{out}$) having a 33%-66% duty-cycle. During the first 33% of the $DIV_{out}$ signal (= 1 “cycle”), the tri-modulus prescaler provides a division ratio $\div 3$, whereas $\div 4$ is selected for the remaining 66% of the output signal (= 2 “cycles”). This is equivalent to counting three VCO periods during the first cycle and four VCO periods during each of the two remaining cycles. Thus $3 + 2 \cdot 4 = 11$ VCO periods occur during one period of the $DIV_{out}$ signal. The timing diagram of this example is illustrated in Fig. 5.13.

Figure 5.11. Transient behavior of the VCO; the equivalent time delay introduced by the VCO is smaller than 2 ns.

**Figure 5.12.** Top-level block schematic of the frequency divider. The high-frequency tri-modulus prescaler (gray-shaded area) comprises a fixed-modulus prescaler, fed by the VCO signal \( VCO_{in} \), and a phase rotator (dashed rectangle). The latter device consists of a phase selector and its logic driver. The low frequency part is formed by a dual-modulus prescaler (right-hand block), which also provides the output signal \( DIV_{out} \) to the phase-frequency detector.

**Figure 5.13.** Timing diagram of the division principle for \( \div 11 \).

An advantage of this topology is that the conventional synchronous divide-by-3/4 device comes into play only at moderate frequencies, right after the divide-by-four high-speed prescaler and the phase selector. To provide each of the specified division ratios, it has been found that the dual-modulus divider has to feature the following division modes and duty-cycles:

1. \( \div 3 \) with a 33%-66% duty-cycle, and

2. \( \div 4 \) with 25%-75% duty cycle.

The arrangement of the two frequency dividers actually enables the synthesis of 15 division factors between 9 and 16 in half-unit steps. Table 5.2 describes the set of division ratios required to cover the lower end of the UWB band as defined in Section 4.3.2.

**Table 5.2. Division modes for Tx and Rx configurations**

<table>
<thead>
<tr>
<th>\( \div \) ratio</th>
<th>Dual-modulus</th>
<th>Tri-modulus</th>
<th>Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td>Tx: 11</td>
<td>/3 (2c.\( ^\S \)+1c.)</td>
<td>2c. /4 + 1c. /3</td>
<td>3.30 GHz</td>
</tr>
<tr>
<td>Rx: 11.5</td>
<td>/3</td>
<td>2c. /4 + 1c. /3.5</td>
<td>3.45 GHz</td>
</tr>
<tr>
<td>Tx: 12</td>
<td>/3</td>
<td>2c. /4 + 1c. /4</td>
<td>3.60 GHz</td>
</tr>
<tr>
<td>n/u\( ^\dagger \): 12.5</td>
<td>/4 (3c.+1c.)</td>
<td>3c. /3 + 1c. /3.5</td>
<td>3.75 GHz</td>
</tr>
<tr>
<td>Tx: 13</td>
<td>/4</td>
<td>3c. /3 + 1c. /4</td>
<td>3.90 GHz</td>
</tr>
<tr>
<td>Rx: 13.5</td>
<td>/4</td>
<td>3c. /3.5 + 1c. /3</td>
<td>4.05 GHz</td>
</tr>
<tr>
<td>Tx: 14</td>
<td>/4</td>
<td>3c. /3.5 + 1c. /3.5</td>
<td>4.20 GHz</td>
</tr>
<tr>
<td>n/u\( ^\dagger \): 14.5</td>
<td>/4</td>
<td>3c. /3.5 + 1c. /4</td>
<td>4.35 GHz</td>
</tr>
<tr>
<td>Tx: 15</td>
<td>/4</td>
<td>3c. /4 + 1c. /3</td>
<td>4.50 GHz</td>
</tr>
<tr>
<td>Rx: 15.5</td>
<td>/4</td>
<td>3c. /4 + 1c. /3.5</td>
<td>4.65 GHz</td>
</tr>
<tr>
<td>Tx: 16</td>
<td>/4</td>
<td>3c. /4 + 1c. /4</td>
<td>4.80 GHz</td>
</tr>
</tbody>
</table>

\( ^\S \) c.: cycle(s), \( ^\dagger \) n/u: not used
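
The entries of Table 5.2 can be cross-checked with the following Python sketch, which sums the VCO periods counted in each cycle of the dual-modulus divider and multiplies the result by the reference frequency. The data structure and function names are illustrative, and the enumeration of reachable ratios assumes that every mode follows the pattern of the table, i.e. all cycles but one use the same tri-modulus ratio.

```python
F_REF_MHZ = 300  # reference frequency assumed from the text

# Rows of Table 5.2: (usage, dual-modulus ratio, [(cycles, tri-modulus ratio), ...])
TABLE_5_2 = [
    ("Tx", 3, [(2, 4.0), (1, 3.0)]),
    ("Rx", 3, [(2, 4.0), (1, 3.5)]),
    ("Tx", 3, [(2, 4.0), (1, 4.0)]),
    ("n/u", 4, [(3, 3.0), (1, 3.5)]),
    ("Tx", 4, [(3, 3.0), (1, 4.0)]),
    ("Rx", 4, [(3, 3.5), (1, 3.0)]),
    ("Tx", 4, [(3, 3.5), (1, 3.5)]),
    ("n/u", 4, [(3, 3.5), (1, 4.0)]),
    ("Tx", 4, [(3, 4.0), (1, 3.0)]),
    ("Rx", 4, [(3, 4.0), (1, 3.5)]),
    ("Tx", 4, [(3, 4.0), (1, 4.0)]),
]


def division_ratio(segments):
    """Total number of VCO periods per DIV_out period."""
    return sum(cycles * modulus for cycles, modulus in segments)


def reachable_ratios(moduli=(3.0, 3.5, 4.0)):
    """All ratios of the form (dual - 1) * a + b with dual in {3, 4}."""
    ratios = set()
    for dual in (3, 4):
        for main in moduli:
            for last in moduli:
                ratios.add((dual - 1) * main + last)
    return sorted(ratios)


for usage, dual, segments in TABLE_5_2:
    ratio = division_ratio(segments)
    print(f"{usage:>3}: /{ratio:<4} -> {ratio * F_REF_MHZ / 1000:.2f} GHz")

print(f"{len(reachable_ratios())} reachable ratios from "
      f"{reachable_ratios()[0]} to {reachable_ratios()[-1]}")
```

Under the stated assumption, the enumeration yields 15 distinct ratios between 9 and 16, in agreement with the count given above.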

### 5.4.2. The Tri-Modulus Prescaler Principle

The principle of the tri-modulus prescaler is based on the backward selection of \( \pi/4 \)-delayed signals. These signals are provided by the high-frequency fixed-modulus prescaler (left-hand block in the gray-shaded area of Fig. 5.12). The timing diagram illustrating the three division modes is shown in Fig. 5.14. A detailed explanation is given hereafter:

- the divide-by-4 mode is obtained by leaving the phase selector continuously in the same state, so that it acts as a “through” function. The division ratio of the tri-modulus prescaler is then equal to the division ratio of the high-frequency fixed-modulus prescaler. This is illustrated in Fig. 5.14 by the arrow labeled $\div 4$;

- the divide-by-3.5 mode is realized by selecting a new input of the phase selector at each positive edge of its output in such a way that the new output signal leads the current signal by 45°. This operation is performed at each period. Arrows labeled $\div 3.5$ illustrate the phase selection in Fig. 5.14. Equivalently, a new positive edge appears at the output of the tri-modulus prescaler after three and a half VCO clock periods;

- similarly, the divide-by-3 mode is obtained by selecting, at each rising edge, the 90°-shifted input such that the new signal leads the current one by 90°. In this case, a new positive edge at the output of the tri-modulus prescaler occurs every three VCO clock periods (“$\div 3$” arrows).
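
The three modes follow from a simple relation: one period of the divide-by-4 signal spans four VCO periods, so advancing the selected phase by a given angle once per output period shortens that period by the corresponding fraction. A minimal Python sketch of this relation, with a hypothetical function name, is:

```python
def tri_modulus_ratio(phase_advance_deg, fixed_ratio=4):
    """Effective division ratio when the phase selector advances the
    selected phase by phase_advance_deg once per output period.

    fixed_ratio is the ratio of the fixed-modulus prescaler (4 in the text).
    """
    return fixed_ratio * (1.0 - phase_advance_deg / 360.0)


for advance in (0, 45, 90):
    print(f"phase advance {advance:>2} deg -> divide-by-{tri_modulus_ratio(advance)}")
```

A 45° advance corresponds to half a VCO period and a 90° advance to one full VCO period, which gives the $\div 3.5$ and $\div 3$ modes, respectively.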

### 5.4.3. High-Frequency Fixed-Modulus Prescaler

The high-frequency fixed-modulus prescaler (Fig. 5.15) is a full-speed divide-by-four prescaler based on two synchronous D-flip-flops (DFF), which have been realized in current mode logic (CML). This topology is required to drive the following stage, namely the phase selector, which requires eight signals accurately delayed by 45° at one fourth of the VCO frequency. In a synchronous topology, the DFF clock inputs are connected in parallel, thus requiring a VCO output buffer. Therefore, the limitation of this topology is the large capacitive load seen by the VCO output and thus, to some extent, its maximum frequency of operation. The advantage is that synchronous dividers enable a direct input of the VCO signal on each latch, thus achieving a complete transition much faster. Moreover, since this high-frequency prescaler has a fixed modulus, it does not require any internal logic gate that would slow the signal edges and reduce the maximum frequency.

The advantage of this solution compared to a variable-modulus prescaler (e.g. swallow counter topologies) is its simplicity and its reliability over a wide frequency range. The reliability of this divider is of prime importance since the signal modulation relies on the PLL. For instance, when compared with a topology using the cascade connection of a divide-by-2 full-speed prescaler followed by two divide-by-2 half-speed dividers as described in [119], the topology proposed in this work does not require any startup circuit to ensure that the eight outputs are correctly and constantly delayed by $\pi/4$.

**Wideband Sensitivity Issues**

Synchronous dividers are feedback systems; the resistive output associated with the stray capacitance actually forms a relaxation oscillator. The best input sensitivity, i.e. the smallest input signal required to obtain the required division ratio at the output of the divider, occurs at the potential frequency of self-oscillation of the prescaler. We usually choose this self-oscillating frequency slightly above the maximum input frequency (typ. 5.5 GHz). Input frequencies above this frequency do not enable a proper division ratio due to the excessive internal delay around the loop formed by the prescaler’s flip-flops.

When used for multi-GHz or very wideband applications, some problems may appear if the VCO input signal frequency is much below the self-resonance frequency. The minimum input that is required to drive the prescaler increases and the output signal frequency may result in a value that does not correspond to the desired division ratio of four.

In this work, a method to match the input sensitivity has been applied. It consists in modifying the self-resonance of the prescaler by modifying its bias in accordance with the tuning voltage of the VCO. This has been implemented to ensure that the prescaler delivers the required division ratios under any conditions of PVT (3σ and 0 °C to 80 °C, respectively). The bias current of the prescaler is simply varied between 70 and 100% of its nominal value over the entire range of the VCO control voltage. As depicted in Fig. 5.16, the resulting self-oscillating frequency is shifted to 4.5 GHz when the VCO operates in the lower end of the UWB bandwidth, which improves the sensitivity by more than a factor of two towards lower frequencies.

### 5.4.4. The Phase-Rotator

The phase-rotator consists of two units: the phase-selector and its driver, also called phase-generator (see block schematic in Fig. 5.12). The latter drives the phase-selector to enable the backward selection of the input phases of the fixed-modulus prescaler.

Figure 5.16. Post-layout simulations of prescaler sensitivity for nominal process condition and temperature \( T = 80^\circ \text{C} \). The high-frequency curve (dashed) is the nominal prescaler sensitivity characteristic. The self-oscillating frequency is set slightly above the maximum VCO frequency of 5.4 GHz (see Fig. 5.10). The low-frequency curve (continuous) represents the prescaler sensitivity characteristic for a VCO operating in the lower end of the UWB bandwidth (typ. around 3 GHz). It follows that the input sensitivity is more or less constant over the entire VCO frequency range and is approximately 120 mV between 3 and 5 GHz (dashed line).

**Phase-Selection Circuit**

The phase-selector is realized in CML logic and is depicted in Fig. 5.17. It consists of eight differential pairs (\( M_3, M_4 \)), whose bias is enabled by an AND-gate (\( M_{A1}-M_{A4} \)). The phase-selector chooses only one of the eight outputs of the fixed-modulus prescaler (signals \( \text{in}X^\circ \) and \( \overline{\text{in}X^\circ} \)). Since the fixed-modulus prescaler provides only four physical outputs, each shifted by 45°, the four additional outputs are taken from the same physical outputs by inverting the polarity in order to cover 360°. The outputs of all differential pairs are connected together to a common differential PMOS load \( M_1-M_2 \). The AND section consists of two cascaded NMOS devices, whose gates are connected to the phase-selector driver. The latter circuit is responsible for the correct phase switching in the phase-selector and is described in the next subsections.

Figure 5.17. Simplified schematic of the phase-selector. Inputs “inX°” come from the fixed-modulus prescaler, whereas inputs “QX” are provided by the phase-generator. Note that this picture does not illustrate the second phase-selector used for the generation of the quadrature signal “outQ” shown in Fig. 5.12. The schematic of the quadrature device is equivalent to the one depicted here, except at the inputs “inX°” which are fed by 90°-shifted output signal provided by the prescaler (i.e. “in0°” should read “in270°”, and so on).

**Divide-by-3/4 Medium Speed Prescaler**

The output frequency of the phase selector is further divided by a dual-modulus prescaler running at moderate speed ($\approx 0.9 - 1.2$ GHz input). The divider is depicted in Fig. 5.18 (top grey-shaded box). The $\div 3$ or $\div 4$ division ratio is set by the logic signal “MODE”. The figure shows the equivalent block schematic with CMOS signals, but the entire circuit is realized in CML logic. The output signal “DIV” is directly connected to the phase-frequency detector (PFD) of the PLL. The required duty cycles (Section 5.4.1) are obtained by the addition of the NAND gate before the output.
**Figure 5.18.** Simplified schematic of the moderate speed dual-modulus divider and the phase selector driver comprising the divider mode selector, the state memory latch and 8-phase generator.
Phase-Selector Driver

The three gray-shaded boxes at the bottom of Fig. 5.18 form the circuit driving the phase selector. This circuit has to provide the appropriate signals to the AND gates of the phase selector to achieve the backward selection of the $45^\circ$-shifted outputs of the fixed-modulus prescaler.

The first block (divider mode selector) employs three multiplexers (MUX) and requires five logic signals (INV, C1, S1, C2, S2) in addition to MODE to set the entire divider in the required division mode. Table 5.3 shows the value of each logic signal of Fig. 5.18 and the corresponding division ratio.

<table>
<thead>
<tr>
<th>$\div$</th>
<th>MODE</th>
<th>INV</th>
<th>S1</th>
<th>C1</th>
<th>S2</th>
<th>C2</th>
</tr>
</thead>
<tbody>
<tr>
<td>11</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>11.5</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>12</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>12.5</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>13</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>13.5</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>14</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>14.5</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>15</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>15.5</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>16</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
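As a compact restatement of Table 5.3, the following Python sketch stores each control word and lists which signals toggle between two division ratios. The names `TABLE_5_3`, `changed_signals` and `vco_frequency` are illustrative only and are not part of the design. The reference frequency `F_REF` of 300 MHz is an assumption, inferred from the 300 MHz frequency change per unit step in division ratio reported with Fig. 5.21.

```python
# Control words of Table 5.3, keyed by division ratio.
# Column order follows the table header: MODE, INV, S1, C1, S2, C2.
SIGNALS = ("MODE", "INV", "S1", "C1", "S2", "C2")

TABLE_5_3 = {
    11.0: (1, 1, 1, 0, 1, 0),
    11.5: (1, 1, 0, 1, 1, 1),
    12.0: (1, 1, 0, 1, 0, 1),
    12.5: (0, 0, 0, 0, 1, 0),
    13.0: (0, 0, 1, 0, 1, 0),
    13.5: (0, 1, 0, 0, 1, 0),
    14.0: (0, 1, 0, 0, 0, 1),
    14.5: (0, 0, 0, 1, 1, 1),
    15.0: (0, 1, 1, 0, 1, 0),
    15.5: (0, 1, 0, 1, 1, 1),
    16.0: (0, 1, 0, 1, 0, 1),
}

# Assumed reference frequency in Hz (inferred, not stated explicitly in the text).
F_REF = 300e6


def changed_signals(n_from, n_to):
    """Return {signal: (old, new)} for every control bit that differs."""
    old, new = TABLE_5_3[n_from], TABLE_5_3[n_to]
    return {name: (a, b) for name, a, b in zip(SIGNALS, old, new) if a != b}


def vco_frequency(n):
    """Locked VCO frequency in Hz for division ratio n, using f_VCO = n * F_REF."""
    if n not in TABLE_5_3:
        raise ValueError(f"division ratio {n} is not listed in Table 5.3")
    return n * F_REF


if __name__ == "__main__":
    print(changed_signals(13.0, 14.0))
    print(vco_frequency(13.5) / 1e9, "GHz")
```

For the $13 \rightarrow 14$ transition, INV is among the toggled bits and changes from 0 to 1. Under the assumed reference frequency, the tabulated ratios 11 to 16 span 3.3 GHz to 4.8 GHz.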

The difficulty lies in the fact that, during BFSK modulation, the phase rotator has to change smoothly from one division ratio to another. In other words, no glitch must appear in the output signal while the division ratio of the tri-modulus prescaler is being changed. In order to obtain a smooth transition, the state of the selector before a change must be stored. This is realized by two D-latches in the driver of the phase rotator. These two latches retain the signal values at the outputs of the I and Q phase selectors until a change occurs at the MUX outputs.

A timing diagram illustrating the transition between two division ratios ($13 \rightarrow 14$) is given in Fig. 5.19. The top graph shows the digital input word configuring the divider (continuous line) and the effective division ratio at the divider output. The values of the digital input word configuring the divider are given in Table 5.3. For this set of division modes, the **INV** signal has to be switched from low to high (bold number in Table 5.3). This may result in a glitch in the **INV MUX out** signal (top dashed circle in Fig. 5.19). The propagation of this glitch through the 8-phase generator block (rightmost gray-shaded area in Fig. 5.18) toward the driver output may cause a wrong phase selection in the phase rotator. The addition of D-latches prevents the propagation of this glitch (bottom dashed circle) and maintains the 8-phase generator in a stable state during a change in the division ratio (plain circles). From the simulations given in the top graph, we observe that the division ratio changes smoothly in less than 3 ns (dashed line).

Figure 5.19. Timing diagram of the frequency divider during the transition between two division ratios ($\div 13 \rightarrow \div 14$). See text for detailed explanations.

5.5. PLL Measurements

A PLL was fabricated separately to enable measurements. The chip micrograph is shown in Fig. 5.20. The chip was equipped with differential CML output buffers for the I and Q RF output signals. A wideband opamp-based output buffer has also been added for monitoring the VCO control voltage.

\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{PLL_chip_micrograph.png}
\caption{PLL chip micrograph. The overall dimension is 1.6 mm by 1.4 mm. The PLL core, including the loop filter, occupies an area of only 0.4 mm $\times$ 0.2 mm.}
\end{figure}

5.5.1. PLL Dynamics for Tx: Frequency Modulation

The transmitter is based on the direct modulation of the PLL frequency, which is achieved by changing the division ratio, as shown in Table 5.3. The transitions between two frequencies must be smooth, meaning that the PLL does not leave its linear region of operation during these transients. Smooth transitions are ensured by latches in the phase generator, which store the state of the phase selector during a change of the division ratio (Section 5.4.4). Figure 5.21 shows the measured response of the VCO control voltage $V_c$ to frequency steps of $\pm300$ MHz. We first observe that the frequency changes in channels A, B and C are smooth and that no glitches or undesired transient phenomena appear during a transition. This enables a reliable BFSK modulation at the transmitter. The PLL exhibits a slightly underdamped response over most of the frequency range. The natural frequency is $f_n \approx 25$ MHz, which is well within the requirements ($f_n > 12$ MHz, as shown in Fig. 5.5). Figures 5.21 a) and b) show the dynamic behavior with no change in the PLL loop configuration. We observe that, in Fig. 5.21 b), the response becomes more underdamped due to the lower VCO gain at higher frequencies. Figure 5.21 c), on the other hand, shows the loop behavior with a slightly increased loop gain to avoid an excessively underdamped step response. This is realized by changing the charge-pump current $I_{cp}$. Modifying the charge-pump current can be easily implemented by enabling or disabling current sources in the charge pump in accordance with the pair of digital words defining the division ratios $N, N + 1$ used for the BFSK modulation.
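The link between VCO gain, charge-pump current and damping can be illustrated with the textbook second-order charge-pump PLL model, i.e. a series R-C loop filter with $\omega_n = \sqrt{I_{cp} K_{VCO} / (2\pi N C)}$ and $\zeta = (RC/2)\,\omega_n$. This model is an assumption made for illustration: the actual loop filter of this PLL is not described in this section, and all component values below are hypothetical.

```python
import math


def loop_parameters(k_vco_hz_per_v, i_cp, n_div, r, c):
    """Natural frequency (Hz) and damping factor of a textbook
    second-order charge-pump PLL with a series R-C loop filter."""
    k_vco = 2 * math.pi * k_vco_hz_per_v            # VCO gain in rad/s/V
    omega_n = math.sqrt(i_cp * k_vco / (2 * math.pi * n_div * c))
    zeta = 0.5 * r * c * omega_n
    return omega_n / (2 * math.pi), zeta


if __name__ == "__main__":
    # Hypothetical values, chosen only to land near f_n = 25 MHz.
    r, c, n_div = 25e3, 0.3e-12, 13.5
    for k_vco, i_cp in ((1.0e9, 100e-6), (0.5e9, 100e-6), (0.5e9, 200e-6)):
        f_n, zeta = loop_parameters(k_vco, i_cp, n_div, r, c)
        print(f"K_VCO={k_vco / 1e9:.1f} GHz/V, I_cp={i_cp * 1e6:.0f} uA: "
              f"f_n={f_n / 1e6:.1f} MHz, zeta={zeta:.2f}")
```

In this model the damping factor scales with $\sqrt{K_{VCO} I_{cp}}$, so a drop in VCO gain can be offset by raising the charge-pump current by the same factor.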

5.5.2. PLL for Rx LO Generation

Phase Noise

The phase noise has to be considered more particularly when the PLL operates as an LO synthesizer for the Rx mode. Since the pulse’s carrier frequency and energy are located well beyond the PLL bandwidth, the VCO noise dominates the overall noise characteristic. As simulated in Fig. 3.35, the typical phase noise tolerated by our receiver is $-105$ dBc/Hz at a frequency offset of 150 MHz. The measured PLL phase noise is reported in Fig. 5.22 for channel B ($f_{LO} = 4.05$ GHz). This picture actually shows a combined measurement of phase noise. Owing to the limited bandwidth of classical phase noise analyzers, the measurement could only be achieved up to 100 MHz (thin gray and dashed curves). Beyond 100 MHz, the measurement has been realized with a classical spectrum analyzer (continuous curve). The latter curve also exhibits spurs at 300 MHz and 600 MHz offset. We observe that the measured phase noise stays below the mask. The latter corresponds to the phase noise power density used in simulations. The required noise profile is characterized by a power density of $-105$ dBc/Hz at 150 MHz offset and a corner frequency equivalent to a PLL bandwidth of 20 MHz.

5.6. Summary and Conclusions

In this Chapter, a PLL for a carrier-based IR-UWB transceiver has been realized and tested. This device is built around a single four-stage ring oscillator, which has been optimized to cover the entire UWB range between 3 and 5 GHz under process and temperature variations. The VCO exhibits sufficient but relaxed phase noise performance to meet the desired receiver specification ($-105$ dBc/Hz at 150 MHz offset). The PLL also features a frequency divider enabling fast and smooth transitions between division modes. The behavior of the PLL has been verified experimentally. Transitions between two carriers for BFSK modulation occur in less than 80 ns with a worst-case accuracy of 17 MHz (transition between $\div 15$ and $\div 16$). This error exceeds the targeted value of $\pm 15$ MHz by 2 MHz. The accuracy at the transmitter can therefore be traded off against a better accuracy of the receiver’s LO, since the latter is not modulated. The PLL draws 23 mA in continuous mode at a supply voltage of 1.8 V. A method to reduce power consumption in devices generating carriers with relaxed accuracy for IR-UWB transceivers is proposed in Appendix D.
Figure 5.21. VCO control voltage response to positive and negative 300 MHz frequency steps (unit steps in division ratio), as used by the BFSK modulation for channels A, B and C (thick lines). The thin dashed lines highlight the maximum frequency error $\Delta f$ after 80 ns.
Figure 5.22. Measured PLL phase noise.
6

Analog Front-End Receiver

6.1. Introduction

This chapter describes the realization of an entire IR-UWB analog front-end, whose purpose is the down-conversion and the amplification of the incoming BFSK-modulated RF pulses. The performance of the analog front-end is crucial for a reliable communication link, since this part is potentially the main source of signal degradation. In comparison to narrowband receivers, IR-UWB poses additional challenges for RF front-ends. The main constraint is the huge amount of available bandwidth, which makes the receiver vulnerable to interferers or other nearby IR-UWB transmitters. To cope with this issue, out-of-band filtering and front-end linearity requirements have already been investigated in Section 3.11. Additional solutions such as an integrated notch filter (Sections 6.2.3 and 6.4.2) will also be proposed to reduce harmonic distortion from 2.4 GHz interferers and relax the linearity constraints on the mixer.

The second difficulty lies in the fact that the receiver has to deal with pulsed signals. A potential threat with this kind of signal is the digital back-end of the receiver itself, described in Chapter 7. Digital CMOS circuits may generate pulse-like noise during the switching of static gates. These spikes are coupled through parasitic elements, power supply lines [120] and the substrate [121] to high-impedance nodes of analog cells. Since the digital back-end’s main frequency is at or related to the pulse rate, pulsed noise may couple into the analog front-end and appear as a usable signal. This issue motivated the choice of a fully balanced topology, from the LNA to the analog input circuit of the digital back-end. In addition, regulatory compliance measurements by the FCC have shown that this digital part can also emit spurious signals above the PSD levels of the prescribed mask. There is therefore an interest in using a low-complexity digital section to reduce the emission of such signals. A block schematic of the implemented analog front-end is given in Fig. 6.1.

The following aspects will be discussed in detail in this Chapter. First, we will investigate the effect and the limit of the bandwidth extension of a cascode LNA. We start from the well-known and proven cascode topology and extend the performance analysis to an amplifier providing a bandwidth larger than 1.5 GHz at a center frequency \( f_c \approx 4 \) GHz. Second, a low-power quadrature mixer will be described and the complete RF front-end is characterized. Finally, a detailed analysis and a realization of a variable gain amplifier (VGA) featuring a simple automatic gain control (AGC) functionality will be presented.

Contrary to the theoretical analysis detailed in Chapter 3, no channel filter has been implemented. In this prototype, the filtering function is realized by the intrinsic low-pass behavior of the mixer IF output and the VGA. Assuming that the corner frequency is close to 200 MHz, as analyzed in Fig. 3.30, this has a negligible effect on the demodulation performance. However, a low-pass filter with sharp roll-off characteristic is required in a multi-channel scenario where the adjacent channels have to be filtered out.
6.2. Low-Noise Amplifier (LNA)

The task of the LNA consists in transferring the available power from the antenna output port to the following blocks, while minimally degrading the SNR. The key specifications of an integrated LNA are gain, noise, bandwidth, linearity, matching (impedance, power and noise), power consumption and stability.

Wideband LNAs can be implemented with several techniques, such as those using feedback [122, 123] or distributed [124, 125] architectures. Although the resistive shunt-feedback topology has been extensively used because of its superior broadband characteristic, this topology exhibits a trade-off between noise and bandwidth (and also input matching). A small feedback resistor between input and output increases the bandwidth, but also increases the noise figure. A higher bias current is thus required to obtain the desired gain and to further reduce the noise figure. On the other hand, distributed architectures are preferred solutions for multi-octave bandwidths and do not fit our application, especially in terms of power consumption and noise.

Simpler topologies using one or two transistors usually rely on common-gate or common-base stages [126]. Unfortunately, low noise figure, wide bandwidth and high gain come at the cost of a relatively large power consumption.

In sub-micron CMOS technologies, common-source topologies have proven to be the best choice in terms of power and noise for narrowband LNAs operating well below the maximum transit frequency $f_{T,max}$ [127]. These solutions mainly aimed at applications with a fractional bandwidth FBW in the order of 1-2%. The first question that one must answer is whether this topology fits the bandwidth requirement of our application. The goal of the following section is to investigate the extension of the fractional bandwidth of a common-source topology up to 40%, which corresponds to the frequency range between 3.3 and 4.8 GHz.
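The quoted fractional bandwidth can be checked with a few lines of Python. The sketch below assumes the geometric center frequency $f_c = \sqrt{f_L f_H}$ that is used later in Section 6.2.2; the function name is illustrative.

```python
import math


def fractional_bandwidth(f_low, f_high):
    """Fractional bandwidth B / f_c with the geometric center f_c = sqrt(f_low * f_high)."""
    f_c = math.sqrt(f_low * f_high)
    return (f_high - f_low) / f_c


if __name__ == "__main__":
    fbw = fractional_bandwidth(3.3e9, 4.8e9)
    print(f"FBW = {100 * fbw:.1f} %")
```

For the 3.3 to 4.8 GHz range this evaluates to a little under the rounded 40% figure.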

6.2.1. LNA Circuit Overview

The overall schematic of the proposed LNA is depicted in Fig. 6.2. The LNA uses a fully balanced topology with an inductively degenerated cascode structure. The LNA is biased by a current source $I_b$. The input is AC coupled via capacitors $C_d$. These capacitors also help in rejecting low-frequency interference. Inductors $L_g$ and $L_s$ determine the input impedance and are chosen to provide a 100 Ω differential impedance. The RF pads are protected by ESD diodes. The input transistors $M_{1a,b}$ are biased by $V_{bg} \approx 1.1$ V. This value allows a sufficient voltage at the drain of the long transistor of the current source $I_b$ ($V_{d,Ib} \approx 0.6$ V) to provide unaltered common mode rejection. $M_{2a,b}$ are the cascode transistors and are directly biased to $V_{dd}$ through resistors $R_{b2}$. They provide isolation between the input transconductance $M_{1a,b}$ and the output I-to-V conversion stage built by the passive components $R_L$ and $L_L$.

A novel notch filter concept has also been included in this LNA. The series resonant network formed by $C_N$ and $L_N$ provides a low-impedance path between the drains of $M_{1a}$ and $M_{1b}$. This prevents any signal located at or close to the chosen resonant frequency from being further amplified into a voltage at the LNA output. This filter has another interesting function: when choosing a notch frequency below the UWB band, the inductive behavior above the resonant frequency helps to cancel out the capacitive input of the cascode stage. This provides a kind of cascode input matching, which improves the current gain of the cascode stage in the case of wide cascode transistors, a feature which is highly desirable to reduce the noise figure of the LNA (see Section 6.2.3).

\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{lina_schematic.png}
\caption{LNA simplified schematic.}
\end{figure}

6.2.2. Wideband Design Techniques

Gain Flatness Constraint on Q-factor

A first issue in the design of a wideband LNA is the gain flatness. It is known that an absolutely flat gain is not necessary over the entire UWB band [128]. However, in a channelized system, a system level gain-flatness requirement is still needed in such a manner that the gain has to be sufficiently flat at least within each of the communication channels.

The gain flatness is defined by the $-3$ dB corner frequencies of the bandpass transfer function of the LNA, i.e. $f_L = 3.3$ GHz and $f_H = 4.8$ GHz. The LNA gain flatness must be considered at the input matching network as well as at the output section. We share the flatness constraint equally between both input and output and assume that both associated equivalent circuits are second order resonant networks with identical Q-factors, i.e. $Q_{in} = Q_{out}$. Since both resonant circuits are cascaded, it can be shown with simple mathematics that the overall Q-factor of the LNA is

$$Q_{LNA} \approx 1.56 \cdot Q_{in}. \quad (6.1)$$
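The factor in Eq. 6.1 can be reproduced with a short calculation. The sketch below assumes two identical, ideally cascaded second-order band-pass stages with $|H|^2 = 1/(1 + x^2)$ and $x = Q\,(f/f_0 - f_0/f)$; the function name is illustrative.

```python
import math


def cascade_q_ratio(n_stages=2):
    """Ratio Q_cascade / Q_single for n identical second-order band-pass stages."""
    # Single stage: |H|^2 = 1 / (1 + x^2), with x = Q * (f/f0 - f0/f).
    # Cascade of n stages: |H|^(2n) = 1/2 is reached at x = sqrt(2^(1/n) - 1).
    x_3db = math.sqrt(2 ** (1 / n_stages) - 1)
    # The -3 dB edges satisfy f_H - f_L = f0 * x_3db / Q, hence Q_cascade = Q / x_3db.
    return 1 / x_3db


if __name__ == "__main__":
    print(f"Q_LNA / Q_in = {cascade_q_ratio(2):.3f}")
```

Under these assumptions the ratio comes out at about 1.55, in line with the rounded factor of Eq. 6.1.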

We calculate the maximum quality factors $Q_{in,max}^{[gain]}$ allowed by the gain flatness condition by assuming a $\pm 10\%$ tolerance on inductors and capacitors. Assuming a worst-case situation where variations of both input and output resonant networks are correlated (same sign), this is equivalent to a variation of the LNA’s geometric center frequency $f_c = \sqrt{f_L \cdot f_H}$ of $\pm 20\%$. This variation is simply added on the application bandwidth $B = f_H - f_L$ in the definition of the quality factor ($Q = f_c / B$) to obtain

$$Q_{in,max}^{[gain]} = \frac{Q_{LNA,max}}{1.56} \approx \frac{f_c}{1.56 \cdot (B + 0.2 \cdot f_c)} = 1.11. \quad (6.2)$$

This defines the maximum (loaded) quality factors at the input ($Q_{in,max}^{[gain]}$) and output ($Q_{out,max}^{[gain]}$) of the LNA to achieve the desired gain flatness. It is important to note that quality factors used in the following sections are always considered as “loaded” (by the resistive source or load), unless explicitly specified.
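The numerical value of Eq. 6.2 follows directly from the band edges. The sketch below simply evaluates that equation; the function and argument names are illustrative.

```python
import math


def q_in_max_gain(f_low, f_high, center_shift=0.2, cascade_factor=1.56):
    """Evaluate Eq. 6.2: maximum loaded input Q allowed by the gain-flatness condition."""
    f_c = math.sqrt(f_low * f_high)      # geometric center frequency
    bandwidth = f_high - f_low           # application bandwidth B
    # The worst-case center-frequency shift is added to the bandwidth.
    q_lna_max = f_c / (bandwidth + center_shift * f_c)
    return q_lna_max / cascade_factor


if __name__ == "__main__":
    print(f"Q_in,max[gain] = {q_in_max_gain(3.3e9, 4.8e9):.2f}")
```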

Impedance Matching Constraint on Q-factor

In a common source topology, the input impedance $Z_g$ at the gate of a MOS transistor is capacitive. A well-known method [129] to match the input impedance to a purely real source impedance $R_G$ consists in degenerating the transistor source terminal by means of an inductance $L_s$. The equivalent half-circuit schematic of the inductively degenerated common source transistor is depicted in Fig. 6.3-a; $Z_g$ is the impedance seen at the gate terminal. In the simplified small-signal schematic of Fig. 6.3-b, the effects of the gate-drain capacitance $C_{gd}$ and the output conductance $g_d$ are first neglected for the sake of simplicity. The influence of these components will be investigated subsequently. With these simplifications, the resulting small-signal equivalent circuit leads to a gate impedance $Z_g$ that is expressed as

$$Z_g = \frac{v_{gs} + v_s}{i_g} = \frac{1}{sC_{gs}} + sL_s + \frac{g_m}{C_{gs}}L_s, \quad (6.3)$$

where $v_{gs}$ and $v_s$ are the gate-source voltage and the source-ground voltage, respectively, and $i_g$ is the small-signal current flowing in the gate terminal. $g_m$ is the transconductance of the transistor and $C_{gs}$ is the gate-source capacitance. The interaction of $L_s$, $g_m$ and $C_{gs}$ generates a real part that can be written as

$$\Re\{Z_g\} = \frac{g_m}{C_{gs}}L_s = \omega_T L_s, \quad (6.4)$$

where $\omega_T$ is the transistor’s transit frequency, i.e. the frequency at which the current gain is unity. $\omega_T$ depends mainly on the overdrive voltage $V_{GS} - V_{th}$.

Figure 6.3. a) Equivalent half-circuit schematic of the inductively degenerated common source transistor; b) equivalent small-signal circuit with neglected gate-drain capacitance $C_{gd}$ and output conductance $g_d$; c) equivalent resonant network for the impedance $Z_{in}$.

The gate impedance $Z_g$, defined in Eq. 6.3, still contains a capacitive term at center frequency $\omega_c$. This term can be resonated out by a series gate inductor $L_g$ so as to obtain a purely resistive input impedance at $\omega = \omega_c$. For that purpose, we define $Z_{in}(s)$ as the impedance of the series RLC resonator seen by the generator $v_G$ (see Fig. 6.3-c)

$$Z_{in}(s) = \frac{1}{sC_{gs}} + s(L_s + L_g) + \omega_T L_s + R_G. \quad (6.5)$$

The value of $L_g$ is obtained by

$$L_g = \frac{1}{\omega_c^2 C_{gs}} - L_s. \quad (6.6)$$

The resulting loaded Q-factor of the input network (including the internal generator resistance $R_G$) is simply calculated with the help of the classical definition of a series resonant circuit as

$$Q_{in} \triangleq \frac{1}{\omega_T L_s + R_G} \sqrt{\frac{L_s + L_g}{C_{gs}}}. \quad (6.7)$$

Under condition of impedance matching at resonance ($\omega_T L_s = R_G$), this equation reduces to

$$Q_{in}^{[\text{Zmatch}]} = \frac{1}{2R_G C_{gs}} \underbrace{\sqrt{C_{gs}(L_s + L_g)}}_{= 1/\omega_c} = \frac{1}{2\omega_c R_G C_{gs}} \approx \frac{Q_g}{2} \quad (6.8)$$

where

$$Q_g \triangleq \left| \frac{\Im\{Z_g\}}{\Re\{Z_g\}} \right| = \frac{\frac{1}{\omega_c C_{gs}} - \omega_c L_s}{\omega_T L_s} \approx \frac{1}{\omega_c R_G C_{gs}} \quad (6.9)$$

is the unloaded quality factor at the gate of the inductively degenerated transistor for $\omega = \omega_c$ under matching conditions. The last approximations in Eq. 6.8 and Eq. 6.9 are permitted owing to the capacitive nature of $Z_g$ ($\omega_c L_s \ll (\omega_c C_{gs})^{-1}$). Since $Q_g$ includes all the parameters of the inductively degenerated common-source stage, it will serve later as a variable for the optimization of the circuit. Since $Q_g \approx 2Q_{in}$, the previous equation shows that the input bandwidth increases as the transistor size ($C_{gs}$) increases.
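The design relations above (matching condition $\omega_T L_s = R_G$, Eq. 6.6, Eq. 6.7 and Eq. 6.9) can be chained into a small calculation. The following sketch uses hypothetical device values ($C_{gs} = 200$ fF, $f_T = 30$ GHz) that are not taken from this work; they only illustrate how $Q_{in}$ relates to $Q_g/2$ and how it scales with $C_{gs}$.

```python
import math


def design_input_match(c_gs, f_t, f_c, r_g=50.0):
    """Size L_s and L_g for a match at f_c and return the resulting Q-factors
    (C_gd and g_d neglected, as in the simplified circuit of Fig. 6.3-b)."""
    omega_t = 2 * math.pi * f_t
    omega_c = 2 * math.pi * f_c
    l_s = r_g / omega_t                                   # omega_T * L_s = R_G
    l_g = 1 / (omega_c ** 2 * c_gs) - l_s                 # Eq. 6.6
    q_in = math.sqrt((l_s + l_g) / c_gs) / (omega_t * l_s + r_g)    # Eq. 6.7
    # Eq. 6.9, taken as the magnitude of Im{Z_g} / Re{Z_g}.
    q_g = (1 / (omega_c * c_gs) - omega_c * l_s) / (omega_t * l_s)
    return {"L_s": l_s, "L_g": l_g, "Q_in": q_in, "Q_g": q_g}


if __name__ == "__main__":
    # Hypothetical device values, for illustration only.
    for c_gs in (200e-15, 400e-15):
        d = design_input_match(c_gs, f_t=30e9, f_c=3.98e9)
        print(f"C_gs={c_gs * 1e15:.0f} fF: L_s={d['L_s'] * 1e9:.2f} nH, "
              f"L_g={d['L_g'] * 1e9:.2f} nH, Q_in={d['Q_in']:.2f}, Q_g/2={d['Q_g'] / 2:.2f}")
```

Doubling $C_{gs}$ in this sketch halves $Q_{in}$, which is the bandwidth trend stated in the text.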

With simple mathematics, it can be shown that, to ensure an input reflection coefficient $|S_{11}| < -10$ dB over the operating frequency range defined by $f_L$ and $f_H$, the maximum quality factor $Q_{in,max}$ of this resonating network must be

$$Q_{in,max}^{[Z_{match}]} \approx \frac{f_{L,H}}{3 \cdot f_c \left|1 - \frac{f_{L,H}^2}{f_c^2}\right|}, \quad (6.10)$$

where $f_{L,H}$ is either the lower or the upper frequency ($|S_{11}(f_L)| = |S_{11}(f_H)| = -10$ dB) and $f_c = \sqrt{f_L f_H}$ is the geometric center frequency. The previous approximation is obtained by replacing the expression of the LNA’s input impedance $Z_{in,LNA}(s) = Z_{in}(s) - R_G$ in the definition of the input reflection coefficient, i.e.

$$|S_{11}(s)| = \left|\frac{Z_{in,LNA}(s) - R_G}{Z_{in,LNA}(s) + R_G}\right| = \left|\frac{Z_{in}(s) - 2R_G}{Z_{in}(s)}\right|, \quad (6.11)$$

where $R_G$ is the source resistance and $Z_{in}(s)$ is defined in Eq. 6.5. For our application, where $f_L = 3.3$ GHz and $f_H = 4.8$ GHz, $Q_{in,max}^{[Z_{match}]}$ is equal to 0.9. When taking $\pm 20\%$ process variations into consideration, the maximum quality factor is reduced to $Q_{in,max}^{[Z_{match}]} \approx 0.5$.
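Eq. 6.10 can be cross-checked against a direct evaluation of Eq. 6.11. The sketch below assumes a perfect match at $f_c$, so that $Z_{in,LNA} = R_G + jX$ with $X = 2 R_G Q_{in} (f/f_c - f_c/f)$, which follows from Eq. 6.5 and Eq. 6.8. The function names are illustrative.

```python
import math


def q_in_max_zmatch(f_edge, f_c):
    """Eq. 6.10: maximum loaded input Q for |S11| = -10 dB at the band edge f_edge."""
    return f_edge / (3 * f_c * abs(1 - (f_edge / f_c) ** 2))


def s11_db(f, f_c, q_in, r_g=50.0):
    """|S11| in dB of a series RLC input network matched to r_g at f_c (Eq. 6.11)."""
    x = 2 * r_g * q_in * (f / f_c - f_c / f)     # reactance away from resonance
    z_in_lna = complex(r_g, x)
    return 20 * math.log10(abs((z_in_lna - r_g) / (z_in_lna + r_g)))


if __name__ == "__main__":
    f_low, f_high = 3.3e9, 4.8e9
    f_c = math.sqrt(f_low * f_high)
    q_max = q_in_max_zmatch(f_low, f_c)
    print(f"Q_in,max[Zmatch] = {q_max:.2f}")
    for f in (f_low, f_high):
        print(f"|S11|({f / 1e9:.1f} GHz) = {s11_db(f, f_c, q_max):.2f} dB")
```

With the geometric center frequency, both band edges give the same maximum quality factor, close to the value of 0.9 quoted in the text.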

We thus observe that, with regard to the input quality factor, the input matching condition, given by $Q_{in,max}^{[Z_{match}]}$ in Eq. 6.10, is more constraining than the one imposed by the gain flatness ($Q_{in,max}^{[gain]}$ of Eq. 6.2). The difficulty for common source structures to achieve impedance matching over the entire UWB bandwidth has also been observed in [130]. In the latter reference, wideband matching is achieved by means of a multisection reactive network at the cost of a larger chip area.

The use of additional on-chip passive elements in front of the LNA, however, introduces additional losses, which degrade the overall noise figure. A preferred method is the use of the input network formed by the bonding wire and the pad of the circuit. The principle is shown in Fig. 6.4-a. The resulting bandwidth extension of the input matching is illustrated in Fig. 6.4-b.

Figure 6.4. Wideband matching principle using pad ($C_p$) and bondwire’s equivalent inductance $L_b$. a) Equivalent schematic of the input network of an inductively degenerated common-source stage. b) The dotted curve shows $S_{11}$ for the classical narrowband matching principle ($Z_{in,LNA}$) and the plain curve shows the input reflection coefficient at the input of the T-network.

Influence of $C_{gd}$ on Z-Match and Q-factor

An often neglected aspect in inductively degenerated common-source stages is the influence of the gate-drain capacitance $C_{gd}$ on the input quality factor. As shown in Fig. 6.5, a modification of the input impedance $Z_g$ occurs. This modification stems from the voltage gain across $C_{gd}$, which is referred to as the “Miller effect”. It can be shown that the (unloaded) quality factor $Q_g$ defined in Eq. 6.9 at the input is transformed into the following expression

\[ Q'_g \approx Q_g \cdot \left[ 1 + M \alpha_C \left( 1 + \frac{1}{Q_g^2} \right) \right], \tag{6.12} \]

where \( M \) is the Miller factor, \( M = 1 - A_v = 1 + \frac{g_{m1}}{g_{m2}} \)\(^1\), and \( \alpha_C = \frac{C_{gd}}{C_{gs}} \) is a technological parameter\(^2\), which expresses the ratio of the gate-drain capacitance to the gate-source capacitance. The detailed calculation of \( Q'_g \) can be found in Appendix A.1.

The approximation in the previous equation is valid for moderately to highly capacitive \( Z_g \), which is the case for lightly degenerated common source transistors \( (1/\sqrt{L_s C_{gs}} \gg \omega_c) \). Typically, for the size of transistor considered in our application and with the help of the cascode topology, the input quality factor is doubled, which corresponds roughly to an input bandwidth reduction by a factor of two.

\(^1\)We neglect the effect of the degeneration on the Miller factor \( M \).

\(^2\)\( \alpha_C \approx 0.35 \) for 0.18-\( \mu m \) N-type MOSFETs, independent of the bias in saturation.

Degeneration Inductance with Miller Effect

The equivalent real part at the gate of the common-source transistor is no longer equal to $\omega_T L_s$ and is modified as

$$\Re\{Z'_g\} \approx \omega_T L_s \frac{1 + Q_g^2}{1 + {Q'_g}^2}, \quad (6.13)$$

where $Q_g$ and $Q'_g$ are defined in Eq. 6.9 and Eq. 6.12, respectively. $Q_g$ contains the influence of the transistor size, whereas $Q'_g$ is related to the Miller effect $M\alpha_C$. The previous equation is however difficult to solve since both $Q$-factors also show a dependence on $L_s$. A good approximation of the required degeneration inductance with the Miller effect is given by the following equation

$$L_s \approx \frac{R_G}{\omega_T} (1 + M\alpha_C)^2. \quad (6.14)$$

The accuracy of this approximation decreases for low quality factors but stays within $\pm 20\%$ of the exact value for $Q_{in} > 1$ ($Q'_g > 1/2$). The resulting equivalent reactance can be well approximated by a capacitance of value (Fig. 6.5-c)

$$C'_g = \frac{1}{|\omega_c \Im\{Z'_g\}|} \approx C_{gs} (1 + M\alpha_C). \quad (6.15)$$

From the values above, it is possible to define a new input quality factor $Q'_{in}$ taking the effect of the Miller capacitance into account,

$$Q'_{in} \approx \frac{1}{2\omega_c R_G C'_g}, \quad (6.16)$$

while the gate inductor is determined as

$$L_g \approx \frac{1}{\omega_c^2 C'_g}. \quad (6.17)$$
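Eqs. 6.14 to 6.17 can be evaluated together to size the input network with the Miller effect included. In the sketch below, $\alpha_C = 0.35$ is taken from the footnote above, whereas the ratio $g_{m1}/g_{m2} = 1$ (hence $M = 2$) and the device values are hypothetical and serve only as an illustration.

```python
import math


def miller_corrected_match(c_gs, f_t, f_c, gm_ratio=1.0, alpha_c=0.35, r_g=50.0):
    """Evaluate Eqs. 6.14 to 6.17 for the input network with Miller effect."""
    m = 1 + gm_ratio                       # Miller factor M = 1 + g_m1 / g_m2
    k = 1 + m * alpha_c                    # recurring term (1 + M * alpha_C)
    omega_t = 2 * math.pi * f_t
    omega_c = 2 * math.pi * f_c
    l_s = r_g / omega_t * k ** 2           # Eq. 6.14
    c_g = c_gs * k                         # Eq. 6.15
    q_in = 1 / (2 * omega_c * r_g * c_g)   # Eq. 6.16
    l_g = 1 / (omega_c ** 2 * c_g)         # Eq. 6.17
    return {"L_s": l_s, "C_g": c_g, "Q_in": q_in, "L_g": l_g}


if __name__ == "__main__":
    # Hypothetical device values, for illustration only.
    res = miller_corrected_match(c_gs=200e-15, f_t=30e9, f_c=3.98e9)
    print(f"L_s = {res['L_s'] * 1e9:.2f} nH, C'_g = {res['C_g'] * 1e15:.0f} fF, "
          f"Q'_in = {res['Q_in']:.2f}, L_g = {res['L_g'] * 1e9:.2f} nH")
```

With $M\alpha_C = 0.7$, the required degeneration inductance grows by a factor of $(1.7)^2 \approx 2.9$ compared with the value $R_G/\omega_T$ obtained without the Miller effect.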

6.2.3. Cascode LNA Transconductance

The overall transconductance $G_{m,LNA}$ of the cascode LNA is determined from the small signal equivalent schematic depicted in Fig. 6.6. This equivalent schematic includes the effect of the gate-drain capacitance $C_{gd}$ and the WLAN notch filter $Z_N$. The LNA transconductance is calculated in two steps. We first estimate the transconductance $G_m$, which corresponds to the combined effect of the transconductance $g_{m1}$ of the common source stage and the input passive network ($R_G$, $L_g$, $L_s$ and $M1$). In a second step, the current gain $A_i$ of the cascode stage M2 is evaluated. The effect of the output conductance $g_{d1}$ of the inductively degenerated transistor M1 will be investigated later (see Eq. 6.41), but since this component is placed in parallel with the notch filter $Z_N$ and the gate-source capacitance $C_{gs2}$, its effect is included during this second step. The overall transconductance is thus the product of the equivalent transconductance of the first stage $G_m$ and the current gain of the cascode stage $A_i$:

$$G_{m,LNA} = G_m \cdot A_i. \quad (6.18)$$

Figure 6.6. Small signal equivalent schematic of the cascode LNA with equivalent Miller capacitance $\alpha C_{gs1}$ and WLAN notch filter $Z_N$.

**Input Stage Transconductance $G_m$**

We first calculate the transconductance of the input stage formed by the degenerated transistor M1 and the input matching network (see Appendix A.2). Assuming that the input is perfectly matched and that $\omega = \omega_c$, the transconductance is

$$G_m(\omega_c) \approx \frac{\omega_T}{j\omega_c} \cdot \frac{1}{1 + M\alpha_C} \cdot \frac{1}{2R_G}. \quad (6.19)$$

Cascode Current Gain

The effect of the cascode transistor M2 is modeled by a current transfer function $A_i$ and is equivalent to

$$A_i = \frac{i_{d2}}{i_{d1}} = \frac{-1}{1 + \frac{Y_{s2}}{g_{m2}}}, \quad (6.20)$$

where $Y_{s2}$ is the admittance at the source of the cascode transistor M2, which contains the output conductance $g_{d1}$ of M1, the gate-source capacitance of M2 and the notch filter $Z_N$ (see next subsection).

Integrated 2.45-GHz Notch Filter

Usually, interference-rejection bandpass pre-select filters (PSF) are inserted before the active part of the RF front-end to reduce the linearity requirements of the following stages and to avoid intermodulation products or desensitization of the receiver. This is done at the cost of a higher noise figure due to the insertion losses of the filter. In this RF front-end, we propose to insert a notch filter centered at 2.4 GHz directly after the first amplification stage. This notch filter does not completely eliminate the need for an off-chip PSF but helps in reducing one of the most threatening sources of interference, i.e., WLAN and WPAN devices operating in the ISM2.4 frequency band. Among the potential interferers reported in Table 3.8, the ISM2.4 band is the most likely to pose a major threat, as shown by the expected power levels of -28 dBm given in Figs. 3.38 and 3.39.

The notch filter consists of a series resistance $R_N$, inductance $L_N$ and capacitance $C_N$ building a series RLC-filter, which has been implemented at the drain of M1 (dashed rectangle in Fig. 6.6). Adding a notch filter at this stage prevents further amplification of the ISM2.4 frequencies through the LNA toward the mixer without significantly degrading the LNA’s noise figure. The role of this resonating filter is to provide a low-impedance path to AC-ground to reduce the gain of the LNA at this particular frequency. The impedance $Z_N$ of the notch filter is written as

$$\begin{aligned}
Z_N &= R_N + j \cdot \left(\omega L_N - \frac{1}{\omega C_N}\right) \\
    &= R_N + j \cdot L_N\left(\omega - \frac{\omega_N^2}{\omega}\right), \quad (6.21)
\end{aligned}$$

where \( \omega_N = 2\pi f_N = (L_N C_N)^{-\frac{1}{2}} \). At the UWB center frequency \( \omega = \omega_c \approx 2\pi \cdot 4 \cdot 10^{9} \) rad/s, Eq. 6.21 can be rewritten as

\[
Z_N \approx R_N + j \cdot L_N \cdot 5\pi \cdot 10^9. \tag{6.22}
\]

The previous equations show that \( Z_N \) is inductive over the entire frequency range of our application between 3.3 and 4.8 GHz. This inductive behavior can be advantageously used to resonate out the gate-source capacitance \( C_{gs2} \) of the cascode transistor (peaking). There is a strong interest in having a large width cascode transistor to reduce the noise (see Section 6.2.4), but in classical implementations, increasing the width of the cascode transistors reduces the current gain \( A_i \) by introducing a pole at the source of M2 \( (Y_{s2} = j\omega C_{gs2} \text{ in Eq. 6.20}) \).
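As a numerical cross-check of Eq. 6.21 and Eq. 6.22, the following minimal Python sketch evaluates the notch impedance. The component values `R_N` and `L_N` are hypothetical placeholders (this section does not state them); only the 2.45-GHz notch frequency, the 4-GHz center frequency and the 3.3 to 4.8 GHz band are taken from the text.

```python
import math


def notch_impedance(f_hz, r_n, l_n, f_notch_hz):
    """Impedance of the series-RLC notch, Z_N = R_N + j*L_N*(w - w_N**2/w) (Eq. 6.21)."""
    w = 2.0 * math.pi * f_hz
    w_n = 2.0 * math.pi * f_notch_hz
    return complex(r_n, l_n * (w - w_n ** 2 / w))


def notch_capacitance(l_n, f_notch_hz):
    """Capacitance C_N that places the series resonance at f_notch_hz."""
    w_n = 2.0 * math.pi * f_notch_hz
    return 1.0 / (w_n ** 2 * l_n)


# Hypothetical component values, chosen only for illustration.
R_N = 2.0          # ohm
L_N = 2.0e-9       # henry
F_NOTCH = 2.45e9   # hertz, notch frequency quoted in the text

if __name__ == "__main__":
    for f in (2.45e9, 3.3e9, 4.0e9, 4.8e9):
        z = notch_impedance(f, R_N, L_N, F_NOTCH)
        print(f"{f / 1e9:.2f} GHz: Z_N = {z.real:.2f} + j{z.imag:.2f} ohm")
    print(f"C_N = {notch_capacitance(L_N, F_NOTCH) * 1e12:.2f} pF")
```

At 4 GHz the reactive part divided by $L_N$ should come out very close to $5\pi \cdot 10^9$ rad/s, and the reactance stays positive (inductive) at both band edges, whatever placeholder inductance is used.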

6.2.4. Noise Analysis

The BSIM3 model used in the BiCMOS7WL design kit does not adequately account for the actual noise generated in the MOS transistor. The main reason is that the induced gate noise is not implemented in this model. Since this noise is proportional to \( f^2 \), it becomes non-negligible at high frequencies. In this section, we estimate the overall noise figure of the cascode topology for wideband designs. In addition to the classical noise analysis considering channel thermal noise sources, both the effect of the gate-drain capacitance (Miller effect) on the transfer functions and the cascode noise are treated in detail. The equivalent schematic used for the LNA noise analysis is depicted in Fig. 6.7. The noise calculation assumes a perfect impedance match at the input, which means that the different noise sources and phenomena considered in the circuit are calculated with a source resistance of 50 \( \Omega \). The noise mechanisms considered in the cascode LNA are listed hereafter:

1. \( e_{nG} \): noise due to the source;
2. \( i_{nch1} \): channel thermal noise of M1;
3. \( i_{ng1} \): induced gate noise of M1;
4. \( c \): correlation effect between \( i_{nch1} \) and \( i_{ng1} \);
5. \( i_{nch2} \): channel thermal noise of M2;
6. \( i_{ng2} \): induced gate noise of M2.
The overall output noise can be calculated by considering the different noise sources and their corresponding contributions at the drain of M2. A generic equation for the noise current at the drain of M2 can be written as

\[
\overline{i_{n,d2}^2} = \underbrace{\overline{e_n^2} \cdot G_{m,LNA}^2}_{\overline{i_{nG,d2}^2}} + \underbrace{N_1(i_{nch1},i_{ng1})^2}_{\overline{i_{n1,d1}^2}} \cdot A_i^2 + \underbrace{N_2(i_{nch2},i_{ng2})^2}_{\overline{i_{n2,d2}^2}}, \quad (6.23)
\]

where \(N_k\) is the noise transfer function through which the equivalent noise at the drain of transistor \(M_k\) is obtained. The expressions below the braces denote the noise contributions of the transistors referred to the corresponding drain. The noise sources that will be detailed later in this section are given in parentheses.

Figure 6.7. Cascode LNA noise calculation overview. The gray shaded current sources are the noise sources considered in this calculation. The gray arrows indicate the correlation between the channel thermal noise and the induced gate currents of M1 and M2.

Noise Due to the Source

The estimation of the noise figure requires the calculation of the contribution of the source to the overall output noise. This is calculated with the help of Fig. 6.6, where the source voltage $V_G$ is replaced by a noise source $\hat{e}_n^2 = 4k_B T R_G \Delta f$. From the expression of the overall transconductance $G_{m,LNA} = G_m \cdot A_i$ obtained from Eq. 6.19 and Eq. 6.20, the equivalent noise current at the drain of M2 can be expressed as

$$
\hat{i}_{nG,d2}^2 = G_{m,LNA}^2 \cdot \hat{e}_n^2 \cdot \left( \frac{1}{1 + M \alpha_C} \right)^2 \cdot \frac{1}{(1 + \frac{Y_{s2}}{g_{m2}})^2} \cdot \frac{4kT \Delta f}{R_G},
$$

(6.24)

where $\alpha_C = C_{gd1}/C_{gs1} \approx 0.35$, $M = 1 + g_{m1}/g_{m2}$ and $Y_{s2} \approx g_{d1}/2$, which corresponds to the condition in which the notch filter inductance cancels out the capacitance at the drain of M1. $\eta$ is the mismatch factor calculated as

$$
\eta = \frac{1}{1 + \frac{\omega_T L_s}{R_G(1 + M \alpha_C)^2}} \approx \frac{1}{2} \text{ (at impedance match)}
$$

(6.25)

where $L_s$ is the degeneration inductor with Miller effect given in Eq. 6.14.

### Noise Due to the Common Source Stage

The noise of the common source stage is analyzed along the lines of the work of Shaeffer [131–133], which provides a methodology for noise analysis including the effect of the induced gate noise. The resulting expressions enable the optimization of the noise figure with respect to bandwidth and power consumption.

### Channel Thermal Noise

We first derive the contribution of the channel thermal noise source $\hat{i}_{nch1}^2$ to the output of the inductively degenerated common-source stage formed by $M1$ and $L_s$. The equivalent schematic for the calculation of the noise is depicted in Fig. 6.8.

Figure 6.8. Equivalent circuit schematic for the calculation of the channel thermal noise contribution $i_{nch,1,d1}$ of the inductively degenerated common-source stage.

To obtain a simplified analytical expression, we neglect the gate-drain capacitance $C_{gd}$ since the noise source appears at the transistor drain and the impedance toward the source of the cascode transistor $M2$ is smaller than the one seen in the direction of the gate of $M1$\(^3\). The expression in the case of a wideband application can be approximated as

$$i_{nch,1,d1}(\omega) \approx i_{nch1} \cdot \frac{1}{1 + \frac{\omega_T L_s}{R_S} \cdot \frac{1}{(1 + jxQ_g)}}, \quad (6.26)$$

where $L_s$ is the value of the degeneration inductance without Miller effect and must satisfy Eq. 6.4; $Q_g$ is the quality factor formed by the source resistance and the input capacitance of the transistor $M1$ and is defined by Eq. 6.9. The variable $x = x(\omega)$ is the detuning factor and is equal to $(\omega/\omega_c - \omega_c/\omega)$, where $\omega_c$ is the geometric center frequency. For the low quality factors required in this design ($Q_{in} \approx 1$), $x \cdot Q_{in} \ll 1$ and the above expression is well approximated over the bandwidth of interest by

$$i_{nch,1,d1}(\omega) \approx i_{nch1} \cdot \eta, \quad (6.27)$$

The error on the transfer function $i_{nch,1,d1}(\omega)/i_{nch1}$ is less than 2 dB over the entire bandwidth.
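The size of this approximation error can be checked numerically. The sketch below compares the magnitude of Eq. 6.26 with the constant of Eq. 6.27 across the 3.3 to 4.8 GHz band. It assumes $\omega_T L_s / R_S = 1$ (impedance match, so that the constant equals 1/2) and a quality factor of 1; both are illustrative assumptions rather than extracted design values.

```python
import math


def channel_noise_transfer(f_hz, f_center_hz, q_in, wt_ls_over_rs):
    """Transfer function i_nch,1,d1 / i_nch1 of Eq. 6.26."""
    x = f_hz / f_center_hz - f_center_hz / f_hz  # detuning factor
    return 1.0 / (1.0 + wt_ls_over_rs / (1.0 + 1j * x * q_in))


def approximation_error_db(f_hz, f_center_hz, q_in=1.0, wt_ls_over_rs=1.0):
    """Magnitude error in dB when Eq. 6.26 is replaced by its value at x = 0."""
    eta = 1.0 / (1.0 + wt_ls_over_rs)  # value of Eq. 6.26 at the center frequency
    exact = abs(channel_noise_transfer(f_hz, f_center_hz, q_in, wt_ls_over_rs))
    return 20.0 * math.log10(exact / eta)


F_LOW, F_HIGH = 3.3e9, 4.8e9              # band edges from the text
F_CENTER = math.sqrt(F_LOW * F_HIGH)      # geometric center frequency

if __name__ == "__main__":
    for f in (F_LOW, F_CENTER, F_HIGH):
        err = approximation_error_db(f, F_CENTER)
        print(f"{f / 1e9:.2f} GHz: error = {err:.2f} dB")
```

Under these assumptions the error at the band edges stays well inside the 2 dB bound stated above; a larger quality factor would widen it.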

---

\(^3\)The impedance seen by the drain of M1 towards the source of the cascode transistor M2 is $Z_{\rightarrow s2} \approx 1/g_{m2}$, whereas the impedance seen in the direction of its gate is at least $|Z_{\rightarrow g1}| = 1/(\omega C_{gd1})$ (we neglect the Miller effect). Thus $|Z_{\rightarrow g1}| = \omega_T/(\omega \alpha_C g_{m1})$; for $g_{m1} = g_{m2}$, $Z_{\rightarrow g1}$ is at least a factor $\omega_T/(\omega \alpha_C)$ larger than $Z_{\rightarrow s2}$.

Channel Thermal Noise Model

The current noise spectral density associated with $i_{nch}$ is

$$S_{i_{nch}} = \frac{\overline{i_{nch}^2}}{\Delta f} = 4kT\gamma_{\text{long}}g_{d0}. \quad (6.28)$$

For a long-channel transistor in saturation ($L > 2 \, \mu m$), the channel thermal noise excess factor $\gamma_{\text{long}}$ is approximately $2/3$. This value increases for shorter transistors because the increased lateral field in the channel induces hot-electron effects and velocity saturation [134]. The BSIM3 model describing the transistors of the BiCMOS7WL technology does not account for this effect and is based on the long-channel approximation. In the hand calculation, we use the thermal noise model developed by Klein [135], which introduces an improved analytical expression for the channel thermal noise valid for deep-submicron MOSFETs with channel lengths smaller than $2 \, \mu m$. Klein's model of the channel thermal noise density is

$$S_{i_{nch}} = 4k_BT\gamma_{\text{long}}g_{d0} + \frac{8}{3}q\nu_{\text{sat}} \frac{I_{ds}}{L} \tau_e = 4k_BT\gamma g_{d0}, \quad (6.29)$$

where $g_{d0}$ is the drain-source conductance for $V_{ds} \ll V_{gs} - V_{th}$ (triode region) and is a decreasing function of the channel length $L$, $k_B = 1.38 \cdot 10^{-23}$ J/K is the Boltzmann constant, $q = 1.6 \cdot 10^{-19}$ C is the elementary charge, $\nu_{\text{sat}} = 1.36 \cdot 10^5$ m/s is the saturation velocity of the carriers (value obtained from the transistor model) and $\tau_e \approx 10^{-12}$ s is a fitting parameter. The factor $\gamma$ in the right-hand expression is the equivalent short-channel noise excess factor. As shown in Figure 6.9 for a channel length $L = 0.18 \mu m$, the channel noise is 6-7 times larger than the long-channel approximation, i.e. $\gamma \approx 4$. Figure 6.9 also confirms that the noise of the BSIM3 model is based on the long-channel approximation.
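To make the role of each term in Eq. 6.29 concrete, the sketch below solves it for the equivalent excess factor $\gamma$. The constants are those quoted above. The drain current, the value of $g_{d0}$ and the temperature of 300 K are hypothetical inputs picked for illustration; in particular, $g_{d0}$ was chosen so that the result lands near the $\gamma \approx 4$ read from Figure 6.9, and is not a value extracted from the design kit.

```python
K_B = 1.38e-23    # J/K, Boltzmann constant
Q_E = 1.6e-19     # C, elementary charge
V_SAT = 1.36e5    # m/s, saturation velocity quoted in the text
TAU_E = 1.0e-12   # s, fitting parameter quoted in the text


def klein_gamma(i_ds, length, g_d0, temp=300.0, gamma_long=2.0 / 3.0):
    """Equivalent excess factor gamma obtained by solving Eq. 6.29 for gamma."""
    s_long = 4.0 * K_B * temp * gamma_long * g_d0               # long-channel term
    s_short = (8.0 / 3.0) * Q_E * V_SAT * (i_ds / length) * TAU_E  # short-channel term
    return (s_long + s_short) / (4.0 * K_B * temp * g_d0)


# Hypothetical operating point (assumptions, not design data).
I_DS = 2.4e-3     # A
LENGTH = 0.18e-6  # m
G_D0 = 14.0e-3    # S

if __name__ == "__main__":
    gamma = klein_gamma(I_DS, LENGTH, G_D0)
    print(f"gamma = {gamma:.2f} ({gamma / (2.0 / 3.0):.1f} x the long-channel value)")
```

The second term scales as $1/L$, so it fades for long devices and the function falls back to $\gamma_{\text{long}} = 2/3$.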

Induced Gate Noise Model

The thermal agitation of charges in the channel gives rise to another noise phenomenon: the induced gate noise. The fluctuating channel potential couples capacitively into the gate terminal, leading to a noisy gate current. This excess noise can be modeled by a noise current source $i_{\text{ng1}}$ in parallel with $C_{gs}$ (Fig. 6.10) and expressed as

$$\overline{i_{ng1}^2} = 4k_BT\delta g_{g} \Delta f, \quad (6.30)$$

where

\[
g_g \approx \frac{(\omega C_{gs})^2}{5g_{d0}}, \quad (6.31)
\]

\[
\delta \approx 2 \cdot \gamma. \quad (6.32)
\]

Figure 6.9. MOSFET channel thermal noise density as a function of the channel length \( L \). A value of \( 2/3 \) (long-channel approximation) fails to model short-channel effects by almost one order of magnitude for \( L = 0.18 \mu m \).

Equation 6.31 shows the frequency dependence of the induced gate noise, whereas the noise excess factor \( \delta \) can be well approximated by taking twice the thermal channel noise excess factor [134].
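The frequency dependence stated by Eq. 6.30 to Eq. 6.32 can be sketched numerically as the ratio of induced gate noise to channel thermal noise. The values used for $C_{gs}$ and $g_{d0}$ below are hypothetical; only the relations themselves come from the text.

```python
import math


def induced_gate_conductance(f_hz, c_gs, g_d0):
    """g_g of Eq. 6.31."""
    w = 2.0 * math.pi * f_hz
    return (w * c_gs) ** 2 / (5.0 * g_d0)


def gate_to_channel_noise_ratio(f_hz, c_gs, g_d0, gamma):
    """Ratio of induced gate noise (Eq. 6.30) to channel thermal noise 4kT*gamma*g_d0."""
    delta = 2.0 * gamma  # Eq. 6.32
    return delta * induced_gate_conductance(f_hz, c_gs, g_d0) / (gamma * g_d0)


# Hypothetical device values, for illustration only.
C_GS = 100e-15   # F
G_D0 = 30e-3     # S
GAMMA = 4.0      # short-channel excess factor discussed in the text

if __name__ == "__main__":
    for f in (2.0e9, 4.0e9, 8.0e9):
        ratio = gate_to_channel_noise_ratio(f, C_GS, G_D0, GAMMA)
        print(f"{f / 1e9:.0f} GHz: ratio = {10 * math.log10(ratio):.1f} dB")
```

Because $\delta/\gamma = 2$, the ratio does not depend on $\gamma$ itself and grows by 6 dB for each doubling of the frequency.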

Induced Gate Noise of a Common-Source Stage

The contribution of the induced gate noise to the output noise current of M1 is calculated from the equivalent schematic depicted in Fig. 6.10 and results in a much more complex equation due to the inductively degenerated source. The calculation consists in determining how much noise from the noise current source \( i_{ng1} \) appears at the drain of \( M1 \). Some simplifications are possible for the same reason as in the previous subsection. It can be shown that, within the 1.5 dB bandwidth, the output noise at the drain of an inductively degenerated common-source stage due to the equivalent noise source \( i_{ng1} \) can be approximated as

\[
i_{ng1,d1} = i_{ng1} \cdot g_{m1} \cdot Z_{ng}, \quad (6.33)
\]

where \( Z_{ng} = v_{gs1}/i_{ng1} \) is the effective transimpedance converting the current noise into a voltage at the gate-source terminal of the transistor \( M1 \):

\[
Z_{ng} \approx \frac{\omega_c Q'_{in}[R_G + j\omega(L_g + L_s)]}{j\omega(1 + jxQ'_{in})}. \quad (6.34)
\]

\( Q'_{in} \) is the quality factor at the transistor’s gate including the source resistance (Eq. 6.16), \( \omega_c \) is the resonance frequency seen by the induced gate noise current source and is equivalent to the center frequency \( \sqrt{\omega_L\omega_H} \), and \( x \) is the detuning factor.

**Total Noise at Common-Source Drain** When calculating the spectral density of the total noise at the drain of \( M1 \), we get

\[
\overline{i_{n,d1}^2} = \overline{i_{n,d1}^* \cdot i_{n,d1}} = \overline{(i_{nch1,d1}^* + i_{ng1,d1}^*)(i_{nch1,d1} + i_{ng1,d1})}.
\]

By replacing $i_{nch1,d1}$ and $i_{ng1,d1}$ by the expressions obtained in Eq. 6.27 and Eq. 6.33, respectively, we obtain

$$\begin{aligned}
\overline{i_{n,d1}^2} &= \overline{\left( \eta\, i_{nch1}^* + i_{ng1}^* Z_{ng}^* g_{m1} \right) \left( \eta\, i_{nch1} + i_{ng1} Z_{ng} g_{m1} \right)} \\
&= \eta^2\, \overline{i_{nch1}^2} + 2 \cdot \Re \left\{ \underbrace{\frac{\overline{i_{nch1}^* i_{ng1}}}{\sqrt{\overline{i_{nch1}^2}\,\overline{i_{ng1}^2}}}}_{\triangleq c} \sqrt{\overline{i_{nch1}^2}\,\overline{i_{ng1}^2}} \cdot g_{m1} Z_{ng} \eta \right\} + g_{m1}^2 |Z_{ng}|^2\, \overline{i_{ng1}^2} \\
&= \overline{i_{nch1}^2} \left[ \eta^2 + 2 \Re \left\{ c \sqrt{\frac{\overline{i_{ng1}^2}}{\overline{i_{nch1}^2}}}\, g_{m1} Z_{ng} \eta \right\} + \frac{\overline{i_{ng1}^2}}{\overline{i_{nch1}^2}} |g_{m1} Z_{ng}|^2 \right], \quad (6.35)
\end{aligned}$$

where $c$ is the correlation coefficient between $i_{nch1}$ and $i_{ng1}$ and is defined as

$$c = \frac{\overline{i_{nch1}^* i_{ng1}}}{\sqrt{\overline{i_{nch1}^2}\,\overline{i_{ng1}^2}}} \quad (\approx -j \cdot 0.39 \text{ for long transistors [134]}). \quad (6.36)$$

This correlation comes from the fact that both noise sources have the same physical origin and are therefore partially correlated. The correlation coefficient is considered as a purely imaginary number owing to the capacitive coupling between the channel and the gate. The negative sign comes from the polarity of the current sources in the equivalent circuit depicted in Fig. 6.10.

In general, $c$ is frequency independent and decreases when the channel length decreases. Values of $c \approx -j \cdot 0.2$ have been measured [136] for 0.18 $\mu$m MOSFETs in saturation. In Eq. 6.35, the first term in brackets is related to the channel thermal noise; the second and third terms correspond to the correlated and the uncorrelated part of the induced gate noise, respectively.

By replacing in the above equation the noise sources by their definitions (Eq. 6.29-6.30), i.e.

$$\frac{\overline{i_{ng1}^2}}{\overline{i_{nch1}^2}} = \frac{4kT \delta \frac{\omega^2 C_{gs1}^2}{5g_{d01}}}{4kT\gamma g_{d01}} = \frac{\delta}{\gamma} \cdot \frac{\omega^2 C_{gs1}^2}{5g_{d01}^2} = \frac{2 \omega^2 C_{gs1}^2}{5g_{d01}^2}, \quad (6.37)$$

and by identifying $\kappa = g_{m1}/g_{d01}$, the total output noise current at the drain of the common-source transistor M1 for \( \omega = \omega_c \) becomes\(^4\)

\[
\overline{i_{n,d1}^2(\omega_c)} = \frac{\overline{i_{nch1}^2}}{2} \left[ 1 - 2|c| \sqrt{\frac{2\kappa^2}{5}} + \frac{2\kappa^2}{5} \left( \frac{Q_{in}^2}{\eta^2(1 + M\alpha_C)^2} + 1 \right) \right], \quad (6.38)
\]

where \( \overline{i_{nch1}^2} = 4kT\gamma g_{d0}\Delta f \). In saturation, \( \kappa = g_m/g_{d0} \) decreases with higher bias current densities or overdrive voltages since

\[
g_{d0} \approx \mu_n C_{ox} \frac{W}{L} (V_{gs} - V_{th}) \approx \frac{I_{ds}}{V_{ds}} \approx W \cdot \frac{J_{ds}}{V_{ds}}
\]

increases and

\[
g_m \approx \mu_n C_{ox}/2 \cdot W E_{sat}
\]

stays relatively constant due to velocity saturation (Fig. 6.11)\(^5\). Therefore, the noise at the drain of a degenerated common source transistor decreases with increasing current densities. Note also that common source stages employing shorter transistors suffer from increased output noise due to a smaller correlated induced gate current (smaller \(|c|\)).

The last term of Eq. 6.38, which represents the uncorrelated part of the induced gate noise, dominates the noise mechanism of the common-source stage. This noise becomes much more important at low current densities (remember that the minimum transistor width, and hence \( C_{gs} \), is dictated by the input bandwidth, which leads to a defined \( Q_{in}' \)) and the desired minimum noise is directly related to the bias current. It is interesting to note that the uncorrelated part of the induced gate noise is reduced for an increased Miller factor \( M \). Unfortunately, the noise contribution of the source (Eq. 6.24) depends on this parameter as well, and the resulting noise figure remains unaffected by an increase of \( M \).

\(^4\)At the resonance \((\omega = \omega_c)\) and under matching conditions, the term \( g_m Z_{ng} \) in Eq. 6.35 can be calculated as

\[
g_m Z_{ng}(\omega_c) = \frac{\omega_T}{\omega_c(1 + M\alpha_C)} \left( Q_{in} - \frac{j}{2} \right).
\]

\(^5\)In a long-channel transistor \( g_m \approx \sqrt{2\mu_n C_{ox} \frac{W}{L} I_{ds}} \) and is consequently proportional to \( \sqrt{J_{ds}} \).

**Figure 6.11.** $g_m$ and $g_{d0}$ of short-channel transistors ($L = 0.18 \, \mu m$) in saturation as a function of the channel current density $J_{ds}$.

**Cascode Noise**

**Figure 6.12.** Equivalent schematic for the calculation of channel thermal noise due to the common gate transistor.

In a cascode topology, the noise contribution from the common-gate (CG) transistor is usually neglected. This is justified only if the impedance seen from the source of the CG transistor is very large, as will be confirmed later in Eq. 6.40. In our realization, this impedance is formed by the output impedance $Z_{o1}$ of the degenerated common-source transistor M1 (which cannot be considered infinite) in parallel with the impedance of the notch filter $Z_N$. This arrangement requires particular attention. The equivalent small-signal circuit is shown in Fig. 6.12. We neglect the channel-length modulation of the common-gate transistor M2 since $g_d^{-1} \gg R_L$. At the source terminal, the induced gate noise current source $i_{ng2}$ appears in parallel with the channel thermal noise $i_{nch2}$. In this hand calculation, we neglect the effect of the small correlation coefficient $c_2$ (short-channel transistor) and simply add the noise power densities of the two noise sources\(^6\). After a few calculations, the resulting equivalent noise current source $i_{ns2}^2$ at the source terminal of transistor $M2$ can be expressed as

$$\overline{i_{ns2}^2} \approx \overline{i_{nch2}^2} \left(1 + \kappa^2 \frac{\omega^2}{\omega_T^2}\right) \approx \overline{i_{nch2}^2}. \quad (6.39)$$

Consequently, for common-gate transistors operating at frequencies $\omega$ much smaller than the transit frequency $\omega_T$, $\kappa^2 \ll \omega_T^2 / \omega^2$ and the induced gate noise can be neglected. Another interpretation of this result is that the common-gate transistor $M2$ does not exhibit any current gain between its gate and its drain terminals. Therefore, such a topology does not amplify the induced gate noise, as is the case for common-source topologies. The thermal noise at the drain of $M2$ can thus be written as

$$i_{nch2,d2}^2 \approx \frac{i_{nch2}^2}{(1 + g_{m2} \cdot Z_{s2})^2} = \frac{4kT \gamma g_{do2}}{(1 + g_{m2} \cdot Z_{s2})^2}.$$  

(6.40)

The term $Z_{s2}$ refers to the impedance seen at the source of the common-gate transistor $M2$ and corresponds to the parallel connection of the output impedance $Z_{o1}$ of the inductively degenerated common-source transistor $M1$, the gate-source capacitance $C_{gs2}$ of $M2$ and the notch filter impedance $Z_N$ (Section 6.2.3). The output impedance of the inductively degenerated common-source transistor $M1$ around the center frequency $\omega_c$ can be approximated by

$$Z_{o1} = \frac{1}{\eta \cdot g_{d1}} \left(= \frac{2}{g_{d1}}\right),$$  

(6.41)

where $g_{d1}$ is the output conductance of the common-source transistor $M1$ and $\eta$ is the mismatch factor defined in Eq. 6.25. As seen in Section 6.2.3, the notch filter can be chosen such that it almost cancels out $C_{gs2}$, the gate-source capacitance of the cascode transistor. Since the equivalent parallel resistance due to the quality factor of the inductance $L_N$ is much larger than the output resistance of the transistor $M1$, $Z_{s2}$ can be simplified to

$$Z_{s2} \approx Z_{o1}.$$  

(6.42)

\(^6\)i.e., the polarity of the two noise currents does not play a role.

The second benefit of the notch filter becomes apparent here. The inductive impedance $Z_N$ within the bandwidth of interest resonates out the gate-source capacitance of the cascode transistor. Since the latter forms a parallel-resonant network with $Z_N$, the impedance seen at the source of $M2$ is higher, which tends to reduce the contribution of the channel thermal noise of $M2$ to the output. Finally, the noise seen at the drain of the transistor $M2$ can be obtained by inserting Eq. 6.42 into Eq. 6.40:

$$\overline{i_{nch2,d2}^2} \approx \frac{4kT\gamma g_{do2}}{\left(1 + \frac{g_{m2}}{\eta g_{d1}}\right)^2}. \quad (6.43)$$

In the denominator of the previous equation, we recognize the ratio of the transconductance of $M2$ to the output conductance of $M1$. As seen in Fig. 6.11, the transconductance of a short-channel transistor in saturation is relatively constant due to velocity saturation effects and is proportional to the width $W$.

The output conductance $g_d$ ($\neq g_{do}$) is extracted from the transistor model and compared with $g_m$ in Fig. 6.13. The resulting ratio $K_g = g_m/g_d$, which is a technological parameter, is approximately 15-20 for current densities $J_{ds} < 100 \ \mu A/\mu m$ and decreases for higher values of $J_{ds}$ due to velocity saturation effects on $g_m$.

In Eq. 6.43, $g_{m2}$ (transistor $M2$) and $g_{d1}$ (transistor $M1$) do not refer to the same device; since transistors $M1$ and $M2$ are biased with the same current, one can apply the square root of the width ratio to obtain the following approximation (we neglect the difference in bulk effect between the two transistors, which slightly reduces $K_g$)

$$\frac{g_{m2}}{g_{d1}} \approx K_g \sqrt{\frac{W_2}{W_1}}. \quad (6.44)$$

Note that this ratio is approximately 50 for long channel transistors.
In the numerator of Eq. 6.43, we also rewrite \( g_{d02} \) as a function of \( g_{d01} \) so as to replace \( 4kT\gamma g_{d02} \), the thermal channel noise of the common-gate transistor, by \( \overline{i_{nch1}^2} \). The definition of \( g_{d0} \), the drain-source conductance at zero \( V_{ds} \), gives

\[
g_{d02} = \mu_n C_{ox} \frac{W_2}{L_2} V_{ov2} = g_{d01} \frac{W_2 V_{ov2}}{W_1 V_{ov1}}, \quad (6.45)
\]

where \( V_{ov} = V_{gs} - V_{th} \) is the overdrive voltage of the transistors. For fixed bias current and transistor length \( L \), the overdrive voltage of a transistor in saturation is proportional to \( \frac{1}{\sqrt{W}} \) and \( g_{d02} \) can be written as

\[
g_{d02} = g_{d01} \sqrt{\frac{W_2}{W_1}}. \quad (6.46)
\]

Therefore, we can simplify Eq. 6.43 as

\[
\overline{i_{nch2,d2}^2} \approx \overline{i_{nch1}^2} \cdot \frac{\sqrt{\frac{W_2}{W_1}}}{\left(1 + \frac{K_g}{\eta} \sqrt{\frac{W_2}{W_1}}\right)^2} \approx \overline{i_{nch1}^2} \cdot \frac{\eta^2}{K_g^2} \cdot \sqrt{\frac{W_1}{W_2}}. \quad (6.47)
\]
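The quality of the last simplification in Eq. 6.47 can be checked with a short sketch that evaluates both forms, normalized to the channel thermal noise of $M1$. The value $K_g = 15$ is the lower end of the range quoted above and $\eta = 1/2$ is the matched value; the width ratios are arbitrary illustration points.

```python
import math


def cascode_noise_exact(width_ratio, k_g, eta=0.5):
    """Middle expression of Eq. 6.47, normalized to the channel noise of M1.

    width_ratio is W2/W1.
    """
    r = math.sqrt(width_ratio)
    return r / (1.0 + (k_g / eta) * r) ** 2


def cascode_noise_approx(width_ratio, k_g, eta=0.5):
    """Right-hand approximation of Eq. 6.47, valid when (K_g/eta)*sqrt(W2/W1) >> 1."""
    return (eta / k_g) ** 2 / math.sqrt(width_ratio)


K_G = 15.0  # lower end of the 15-20 range quoted in the text

if __name__ == "__main__":
    for ratio in (0.5, 1.0, 2.0, 4.0):
        exact = cascode_noise_exact(ratio, K_G)
        approx = cascode_noise_approx(ratio, K_G)
        print(f"W2/W1 = {ratio:.1f}: exact = {exact:.2e}, approx = {approx:.2e}")
```

For equal widths the two forms differ by only a few percent, and both fall as $W_2/W_1$ grows, in line with the preference for a wide common-gate device.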

The relatively large value of \( K_g \), even for short channel transistors below velocity saturation, explains why the common-gate transistor noise is usually neglected in a cascode LNA. However, in order to achieve gain over a large bandwidth with minimum current consumption, a large common-source transistor \( M1 \) is desired (\( g_m \) must be large to avoid excessively large degeneration inductances \( L_s \)). A large common-source transistor \( M1 \), however, increases the contribution of the common-gate transistor \( M2 \) to the total output noise due to the term \( 1/g_{d1} \) in the denominator of Eq. 6.43. This appears as the dependence on \( W_1 \) in Eq. 6.47. The factor \( W_2 \) in the denominator means that large common-gate transistors are preferable for low noise. However, in classical topologies (i.e. without notch), \( W_2 \) has two influences that counteract each other. First, the term \( W_2 \) in the above expression comes from the transconductance \( g_{m2} \), which dictates the amount of channel thermal noise. Second, a large width reduces the impedance seen at the source of \( M2 \) due to an increased source-drain capacitance, which increases the noise at the drain. In our realization, the latter noise contribution has been reduced by the inductive behavior of the notch filter in the UWB band. \( M2 \) can thus be chosen with a larger width. The upper limit is set by a pole that reduces the current gain \( A_i \) (see Eq. 6.20).

Noise Contribution of the Resistive Load

The last contribution to the overall output noise of the LNA comes from the load $R_L$ (Fig. 6.2). Since the quality factor at the output is limited for bandwidth reasons, a resistive element is placed in series with the resonating load inductor $L_L$. This resistance decreases the quality factor of the on-chip inductor, which forms an equivalent parallel resonant circuit with the capacitive input of the mixer. This series resistive element also introduces some noise, which directly adds to the output of the LNA. A polysilicon on-chip resistor is preferred since the noise contribution of a PMOS transistor used in triode mode is much larger.

The required resistive load is calculated as

$$R_L \approx \frac{Q_{\text{out}}}{\omega_c L_L},$$  \hspace{1cm} (6.48)

where $L_L$ is chosen such as to resonate with the total output capacitance $C_{\text{out}}$ at $\omega = \omega_c$. The noise contribution of the load resistor is thus

$$\overline{i_{nL,d2}^2} = \frac{4kT\Delta f}{R_L}. \quad (6.49)$$

Part of this noise is absorbed in the output capacitance of M2 and does not directly appear at the input terminal of the mixer. In this calculation, we neglect this noise, since it contributes only a small fraction of a dB to the overall noise figure (for $R_L \approx 200\,\Omega$, $\overline{i_{nL,d2}^2}$ is smaller by one order of magnitude than the induced gate noise).

LNA Noise Figure

Neglecting the noise due to the cascode transistor $\overline{i_{nch2,d2}^2}$ and the noise from the load $\overline{i_{nL,d2}^2}$, the corresponding noise factor $F(\omega)$ can be approximated as follows

$$F(\omega) = \frac{\overline{i_{nG,d2}^2} + \overline{i_{n,d1}^2} \cdot A_i^2}{\overline{i_{nG,d2}^2}}, \quad (6.50)$$

where $\overline{i_{nG,d2}^2}$ is the output noise due to the source (Eq. 6.24), $\overline{i_{n,d1}^2}$ is the noise at the common-source drain (Eq. 6.38) and $A_i$ is the common-gate current transfer function. Finally, by adding the contribution of the series resistive elements at the gate of the common-source stage, the noise factor at center frequency $\omega_c$ can be written as

\[
F(\omega_c) \approx 1 + \frac{R_{Lg} + R_g}{R_S} + \frac{\gamma \omega_c}{\kappa \omega_T} \cdot \frac{1}{2Q_{in}} \cdot (1 + M\alpha_C)^2 \cdot \left[ 1 - 2|c|\sqrt{\frac{\kappa^2\delta}{5\gamma}} + \frac{\kappa^2\delta}{5\gamma} \left( \frac{Q_{in}^2}{\eta^2(1 + M\alpha_C)^2} + 1 \right) \right], \quad (6.51)
\]

where \( R_g \) is the gate resistance and \( R_{Lg} \) is the series resistance of the gate inductor \( L_g \). \( R_{Lg} \) equals \( \omega_c L_g / Q_L \), \( Q_L \) being the quality factor of the on-chip inductors. The gate resistance \( R_g \), for gate fingers contacted at both ends, is defined as

\[
R_g = \frac{R_\square W}{12n^2L},
\]

where \( R_\square \) is the polysilicon sheet resistance \((8\Omega/\square, [115])\) and \( n \) is the number of fingers.
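The gate resistance formula and the noise factor expression for $F(\omega_c)$ given above can be evaluated together, as in the sketch below. Apart from $R_\square = 8\,\Omega/\square$, $\alpha_C \approx 0.35$, $|c| \approx 0.2$, $\gamma \approx 4$ and $Q_{in} \approx 0.91$, which appear in the text, every number is a hypothetical placeholder (device width, finger count, inductor series resistance, $\kappa$, $\omega_c/\omega_T$, $M$). The result is therefore not expected to reproduce the noise figure calculated in the thesis.

```python
import math


def gate_resistance(r_sheet, width, n_fingers, length):
    """Gate resistance for fingers contacted at both ends: R_sq*W / (12*n^2*L)."""
    return r_sheet * width / (12.0 * n_fingers ** 2 * length)


def noise_factor_center(r_series, r_s, gamma, delta, kappa, c_mag,
                        q_in, eta, miller, wc_over_wt):
    """Noise factor at the center frequency; miller stands for (1 + M*alpha_C)."""
    a = kappa ** 2 * delta / (5.0 * gamma)
    bracket = (1.0 - 2.0 * c_mag * math.sqrt(a)
               + a * (q_in ** 2 / (eta ** 2 * miller ** 2) + 1.0))
    active = (gamma / kappa) * wc_over_wt / (2.0 * q_in) * miller ** 2 * bracket
    return 1.0 + r_series / r_s + active


# Hypothetical layout and bias values (assumptions, not design data).
R_G = gate_resistance(r_sheet=8.0, width=100e-6, n_fingers=20, length=0.18e-6)
R_LG = 3.0                    # ohm, assumed series resistance of the gate inductor
MILLER = 1.0 + 2.0 * 0.35     # M = 2 corresponds to g_m1 = g_m2

F_CENTER = noise_factor_center(r_series=R_LG + R_G, r_s=50.0, gamma=4.0, delta=8.0,
                               kappa=1.0, c_mag=0.2, q_in=0.91, eta=0.5,
                               miller=MILLER, wc_over_wt=0.1)

if __name__ == "__main__":
    print(f"R_g = {R_G:.2f} ohm")
    print(f"F = {F_CENTER:.2f}, NF = {10.0 * math.log10(F_CENTER):.2f} dB")
```

Splitting the bracket into its three terms inside `noise_factor_center` is an easy way to see how the correlated and uncorrelated induced gate noise contributions compare for a given set of parameters.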

**Discussion**

The size of the transistor \( M1 \) (common-source) is fixed by the required input matching (Eq. 6.10) and input bandwidth, through the quality factor \( Q'_{in} \) (Eq. 6.12). The latter also fixes the amount of amplification that occurs at the gate-source terminal due to the resonance of the input network. Increasing the width of the common-source transistor \( M1 \) to increase \( g_{m1} \) is accompanied by a reduction of the Q-factor at the input and does not improve the overall gain of the LNA. A way to increase the gain of the LNA is to increase the Q-factor at the load by reducing the equivalent series resistance \( R_L \). Unfortunately, the maximal quality factor at the load also sets an upper value for \( R_L \), and gain can only be obtained at the cost of a higher bias current, which determines the transistor’s transit frequency \( \omega_T \) in Eq. 6.19.

As we will see later in Section 6.2.6, the noise is dominated by the induced gate noise. This may appear surprising since the input Q-factors involved are low (on the order of one) and \( \overline{i^2_{ng1}} \) is almost two orders of magnitude smaller than the channel thermal noise \( \overline{i^2_{nch1}} \). The main reason for the dominating induced gate noise is the transimpedance \( Z_{ng} \) seen by the induced gate noise source \( i_{ng1} \) (Fig. 6.10). This transimpedance is also resonant (Eq. 6.34) with Q-factor \( Q'_{in} \) but has a peak value on the order of $40 \text{ dB-} \Omega$, thus bringing the induced gate noise to a level comparable to the thermal channel noise at the drain of $M_1$.

### 6.2.5. Layout Issues

**Single-ended vs. Differential**

The main reasons for using a differential topology are the common-mode rejection and the second-order linearity. Any noise on the power supply line appears as a common-mode signal and is therefore ideally not seen at the output. Such a topology has also been used in the LNA since IR-UWB deals essentially with short pulses. Incoming parasitic pulses at the front-end are amplified and may be mistaken for a valid signal in the digital baseband. This may occur especially during the initial signal acquisition, in which an estimation of the SNR is carried out (see Section 7.3).

The main drawback of the differential topology is that it requires twice the power for the same theoretical performance. In the overall power budget of the receiver, this increase in power consumption has been considered acceptable in view of the better immunity it offers against pulsed noise.

### 6.2.6. Simulation and Measurements

The LNA circuit has been separately implemented and measured on a test substrate. For the measurements, the nominal supply voltage of 1.8 V and a bias current for the LNA core of 4.8 mA have been used (i.e. differential topology using $2 \times 2.4 \text{ mA}$). On the test chip, a wideband buffer (not shown) has been added to drive the 50-$\Omega$ input impedance of the measurement instruments. This wideband buffer also presents a load capacitance similar to that of the subsequent mixer, but increases the noise figure by about 0.5 dB.

**S-parameters**

In Figure 6.14, the measured differential gain $|S_{\text{dd}}^{21}|$ is compared with post-layout simulations. A peak gain of 23.5 dB is measured around 4.2 GHz, while 19 dB and 21 dB are observed at the edge frequencies (3.3 and 4.8 GHz), respectively. The gain variation within each of the bands A, B and C stays smaller than $\pm 1 \text{ dB}$. The LNA has a bandpass characteristic that helps to reduce the influence of out-of-band interferers (Section 3.11). At 900 MHz (mobile phone interferers), a rejection of more than 20 dB is measured. The integrated notch (see p. 189) is located around 2.45 GHz and provides an additional 10 dB of rejection to attenuate the interfering narrowband WLAN signals. Compared with the minimum in-band gain, the total rejection at this frequency reaches 17 dB.

The reverse isolation $S_{12}$ is not illustrated but stays below -45 dB over the entire bandwidth. This high isolation limits the amount of local oscillator power that may leak to the antenna and be partially reflected back to the mixer through the LNA. This effect causes self-mixing, which is a critical problem in zero-IF receivers. A good reverse isolation also limits the amount of power that is radiated to nearby receivers tuned to the same frequency. The last benefit of a good reverse isolation is that it ensures stability.

Figure 6.15 shows the measured input matching of the LNA along with the post-layout simulation. The measured $|S_{11}^{dd}|$ is better than -10 dB (> 90% of power transfer from the test equipment to the LNA) over the application bandwidth between 3.3 and 4.8 GHz. The discrepancy above 4.5 GHz is due to a higher than expected parasitic inductance at the input port caused by the wirebonds. However, this parasitic inductance can be advantageously used to extend the bandwidth of the input matching above the application range. This feature is desired when a preselect filter is used. In order to provide the adequate transfer function, passive preselect filters must be loaded with a specific output impedance (typ. 50Ω). In our case, the performance of an additional filter to reject the 5-6 GHz WLAN bands is guaranteed by a near-50Ω impedance at these frequencies.
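The 90% figure quoted for a -10 dB input match follows from the standard relation between reflection coefficient and delivered power, $1 - |S_{11}|^2$. A minimal sketch of this conversion (function names are illustrative):

```python
import math


def delivered_power_fraction(s11_db):
    """Fraction of the available source power delivered to the load, 1 - |S11|^2."""
    return 1.0 - 10.0 ** (s11_db / 10.0)


def mismatch_loss_db(s11_db):
    """Power lost to reflection, expressed in dB."""
    return -10.0 * math.log10(delivered_power_fraction(s11_db))


if __name__ == "__main__":
    for s11 in (-6.0, -10.0, -15.0, -20.0):
        frac = delivered_power_fraction(s11)
        print(f"S11 = {s11:.0f} dB: {100.0 * frac:.1f} % delivered, "
              f"{mismatch_loss_db(s11):.2f} dB mismatch loss")
```

At -10 dB the mismatch loss is below half a dB, which is why this level is a common matching target.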

**Noise Figure Measurements**

The measured noise figure over the application bandwidth is given in Fig. 6.16 for the chosen bias current of 4.8 mA. The half-circuit equivalent bias current is therefore 2.4 mA. Assuming an input quality factor of \( Q_{in, meas} \approx 0.91 \), as depicted in Fig. 6.15, the calculated noise figure is 4.1 dB (Eq. 6.51). The dot marker in Fig. 6.16 includes an additional 0.5 dB contribution from the output buffer. As expected, the BSIM post-layout simulations (dashed curve) give results that are too optimistic by approximately 1.5 dB. This roughly corresponds to the contribution of the induced gate noise, which is not taken into account in the BSIM3 model of the transistor.

Under the same condition and assuming an output quality factor at the load of \( Q_{out} \approx Q'_{in} \), the LNA gain is calculated as 16 dB. In the designed LNA, the output Q-factor has been chosen twice the input factor to provide some more gain. Thus, the overall Q-factor of the LNA is almost completely determined by the output Q-factor and is \( Q_{LNA} \approx 2.5 \), as shown in Fig. 6.14. By choosing a doubled output Q-factor, the calculated peak gain becomes 22 dB, which is close to the value obtained by measurements.

**Linearity**

The LNA does not entirely determine the front-end linearity, which is also influenced by the mixer. The front-end linearity will be investigated later in Section 6.4.2 on p. 221 and has to meet the specification formulated in Section 3.11. In this section, measurements on the LNA’s linearity are given for the sake of completeness.

**IIP3 and ICP_{1dB}** To measure the linearity of an RF device, the two-tone test is commonly used, where one of the tones represents the desired signal and the other represents an adjacent channel interferer. The two-tone test for third-order intermodulation distortion and the measured 1-dB input compression point (ICP_{1dB}) are shown in Fig. 6.17.
The measured input quality factor is $Q_{in,\text{meas}} \approx 0.91$; this value has been extracted from the measured $S_{11}$ parameters (dot markers) with the help of Eq. 6.10.

**Figure 6.16.** Measured noise figure of the cascode UWB LNA. The hand-calculation model includes an additional 0.5 dB introduced by the output buffer used for measurement purposes.

For our application, the two-tone test is performed in the middle of the UWB band (channel B) with tones placed at frequencies $f_{c,B} \pm f_o$, i.e. $f_{\text{tone}1} = 3.9$ GHz and $f_{\text{tone}2} = 4.2$ GHz. The measured intermodulation products are located in adjacent channels A and C, at 3.6 and 4.5 GHz, respectively. Both intermodulation products show similar amplitude and result in an input-referred third-order intercept point (IIP3) of -13.4 dBm. The IIP3 can also be estimated from

$$IIP3 = \frac{3}{2}P_{out} - \frac{1}{2}P_{IM3} - G = P_{in} - \frac{\Delta P_3}{2}, \quad (6.53)$$

where $P_{in}$ is the input power of one tone, $P_{out}$ the output power of one tone, $P_{IM3}$ the power of the intermodulation products at the output and $G$ the power gain. $\Delta P_3 = P_{IM3} - P_{out}$ is the level of the intermodulation product relative to the fundamental at the output, which is negative below the intercept point (all values in dB or dBm).
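As a numerical illustration of Eq. 6.53, the following sketch evaluates the intercept point from a single two-tone measurement point. The input values in the example are hypothetical and are not taken from Fig. 6.17; the function name is also an assumption. The sketch only holds in the small-signal region, where the intermodulation products rise by 3 dB per dB.

```python
def iip3_dbm(p_out_dbm: float, p_im3_dbm: float, gain_db: float) -> float:
    """Input-referred third-order intercept point from a two-tone test.

    Implements IIP3 = 3/2 * P_out - 1/2 * P_IM3 - G (all quantities in dB/dBm).
    """
    return 1.5 * p_out_dbm - 0.5 * p_im3_dbm - gain_db


if __name__ == "__main__":
    # Hypothetical measurement point, for illustration only.
    gain_db = 20.0
    p_in_dbm = -40.0
    p_out_dbm = p_in_dbm + gain_db      # fundamental tone at the output
    p_im3_dbm = p_out_dbm - 50.0        # IM3 product 50 dB below the fundamental
    print(f"IIP3 = {iip3_dbm(p_out_dbm, p_im3_dbm, gain_db):.1f} dBm")
```

With these example numbers the result equals the input power plus half of the 50 dB spacing between fundamental and intermodulation product, which is the usual graphical reading of the intercept point.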

Since the required IIP3 for the entire front-end is -22 dBm (see Fig. 3.39), this value leaves some margin for the design of the mixer. The ICP_{1dB} has been measured at -25.8 dBm, which is 12.4 dB below the IIP3.
**Figure 6.17.** Measured third-order nonlinearity of the cascode UWB LNA. The frequencies of the tones are 3.9 and 4.2 GHz (channel B); they generate intermodulation products in channels A and C.

**IIP2** The second-order intermodulation distortion is rarely considered in LNA’s. In our application, it is of importance, since large bandwidths come into play. The effect of this distortion has been determined by means of two measurements. The first one simply consists in measuring the second-order distortion of an incoming in-band signal (here at a frequency of 3.6 GHz) to obtain an estimate of the second-order intercept point (IIP2). The measured second harmonic (at 7.2 GHz) is reported in Fig. 6.18. To account for the non-flat LNA gain, the gain difference between 3.6 and 7.2 GHz has been added to the measured amplitude of the second harmonic. Note that this method gives a worst-case estimate of the linearity, since the gain difference does not entirely take place after the generation of the second harmonic. In Fig. 6.18, the markers represent the measured values at the fundamental and second-harmonic frequencies, whereas the solid lines are linear regressions of the low-level measurements, whose crossing point defines the intercept point. The measured IIP2 is +17 dBm. Under the same conditions, simulations showed an IIP2 of +30 dBm with a 3σ variation of 15 dB due to process mismatch. We thus observe that the obtained result stays within the simulated range. However, the hybrid device that has been used to generate the differential signal also introduces a mismatch that may cause an additional decrease in the measured linearity.

![Figure 6.18. Measured second-order nonlinearity of the cascode LNA.](image)

The second method considers a realistic scenario, in which the interference signal at 2.4 GHz mimics the effect of a narrowband WLAN interferer. In this case, the second-order harmonic falls exactly at 4.8 GHz, which is the upper offset frequency of channel C, and is downconverted to the baseband (it corresponds to the rightmost interferer illustrated in Fig. 3.38). The equivalent input-referred second-order harmonic of this signal is reported in Fig. 6.18 by the dashed curve as a function of the input power at 2.4 GHz. We observe that the input power of the interferer should not exceed -30 dBm to result in an inband interferer smaller than -83 dBm. This limit has been fixed in Section 3.11 to enable reliable communication at the maximum sensitivity. Assuming a PSF providing an additional attenuation of 21 dB, this input power corresponds to a WLAN interferer located at roughly 10 cm.

### 6.2.7. Summary and Conclusions

The 0.18-μm wideband CMOS LNA described in this section offers a gain of more than 19 dB for a noise figure smaller than 6 dB between 3.3 and 4.8 GHz, while consuming 4.8 mA (8.6 mW at the nominal supply voltage of 1.8 V). It uses a cascode topology and features a notch filter centered at 2.4 GHz, which helps to reduce interfering signals from nearby WLAN and Bluetooth transmitters. A detailed noise analysis has been performed. This analysis reveals that the noise performance is dominated by the induced gate noise, which is not taken into account in the BSIM3 model of the transistors.

A summary of the LNA performance is given in Table 6.1 and compared with recently reported 0.18-$\mu$m CMOS LNA’s targeting the lower UWB band between 3 and 5 GHz. The realized LNA shows excellent performance in terms of gain and power consumption, at the cost of a slightly less competitive noise figure. This higher gain, however, helps to reduce the noise contribution of the subsequent stages, thus enabling a lower power consumption for the entire analog RF front-end. Note also that this low power consumption is obtained with a fully differential topology. Such a topology is expected to show additional noise immunity when the LNA is used in a receiver system with a digital baseband.

## 6.3. Down-Conversion Mixers

### 6.3.1. Introduction

Many mixer topologies have been published over the past decades; they can be classified into three main families:

- **Passive mixers:** Passive mixers are the most common and simplest down-conversion circuits. They offer higher linearity at the cost of lower conversion gain (maximum theoretical gain of $2/\pi \rightarrow -3.9$ dB, but rather -5 to -10 dB in practical implementations) and poor noise figure. Moreover these devices require large LO amplitude compared to active mixers. Since the last stages in a receiver chain are predominant for IIPx performances, they are usually more suitable for the second down-conversion of double-conversion heterodyne architectures. Typical passive mixers are based on diode or MOSFET rings;

- **Active mixers:** The term generally applies to any mixer which is powered by more than the LO drive signal and which uses devices in amplification modes, so as to provide conversion gain as well as low noise and high overall linearity. This is the preferred type of mixer for our applications since it provides flexibility and allows an excellent optimization.
Table 6.1. LNA performance summary and comparison with recently reported 0.18-µm CMOS LNA’s targeted for lower UWB band between 3 and 5 GHz.

<table>
<thead>
<tr>
<th>Ref.</th>
<th>BW [GHz]</th>
<th>Gain [dB]</th>
<th>NF [dB]</th>
<th>IIP3 [dBm]</th>
<th>$P_{\text{supply}}$</th>
<th>Topology and Details</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>This work</strong></td>
<td>3.3-4.8</td>
<td>&gt;19</td>
<td>4.5-6.2</td>
<td>-13.4 @ 4 GHz</td>
<td><strong>8.6 mW @ 1.8 V</strong></td>
<td>Fully differential cascode with WLAN notch filter</td>
</tr>
<tr>
<td>[130]</td>
<td>3.1 - 10.6</td>
<td>9.3†</td>
<td>4.2 - 5</td>
<td>-6.7 @ 6 GHz</td>
<td>9 mW @ 1.8 V</td>
<td>Single-ended cascode with Chebyshev input network</td>
</tr>
<tr>
<td>[137]</td>
<td>2 - 8.5</td>
<td>13</td>
<td>4.1 - 4.8</td>
<td>-13.5 @ 3 GHz</td>
<td>9.3 mW @ 1.8 V</td>
<td>Single-ended common-gate with local feedback and peaking</td>
</tr>
<tr>
<td>[138]</td>
<td>3 - 5</td>
<td>13</td>
<td>4.6 - 5</td>
<td>0.1 @ 4 GHz</td>
<td>14.6 mW @ 1.8 V</td>
<td>Single-ended cascode with shunt feedback and LC-ladder filter</td>
</tr>
<tr>
<td>[139]</td>
<td>3 - 6</td>
<td>13.7†</td>
<td>3.6 - 5</td>
<td>-5 @ 4 GHz</td>
<td>12.5 mW @ 1.8 V</td>
<td>Single-ended cascode with feedback loop</td>
</tr>
<tr>
<td>[140]</td>
<td>3 - 6.5</td>
<td>16</td>
<td>1.9 - 3.4</td>
<td>-13 @ 4 GHz</td>
<td>4.5 mW @ 1.06 V</td>
<td>Single-ended with self-forward-body-bias</td>
</tr>
<tr>
<td>[141]</td>
<td>2.4 - 6</td>
<td>17.5 - 18.2</td>
<td>3.5 - 5</td>
<td>-13</td>
<td>39 mW @ 1.8 V</td>
<td>Three-stage pseudo-differential</td>
</tr>
</tbody>
</table>

† power gain
- **Subsampling mixers:** This kind of mixer resembles digital down-conversion but does not fit pulse-based signalling with very low duty cycles. It would require many RF pulses to downconvert and reconstruct a single pulse at baseband and would therefore reduce the achievable data rate.

### 6.3.2. Implemented Mixer

The receiver front-end employs two mixers for quadrature signal demodulation. In Fig. 6.19, the simplified equivalent schematic of only one mixer is depicted for the sake of clarity. It uses an active double-balanced topology, which is usually preferred for the superior performance offered by a fully differential circuit. This topology achieves a better immunity to substrate and supply noise. The noise immunity is very important in IR-UWB receivers working with pulsed signals, since any pulsed noise (typically the noise which may be coupled from the digital baseband) may appear as a signal in single-ended topologies. Another advantage of the double-balanced mixer is the smaller LO leakage into the IF output: switch transistors with opposite output polarity (i.e., M8-M10 and M9-M11) are connected together and consequently tend to cancel the LO signal at their outputs.

The RF input of the mixer circuit makes use of the balanced signals provided by the differential LNA described in Section 6.2. The RF signal voltage is converted into a current by a pseudo-differential pair (grounded-source pair) formed by transistors M1 and M2. The differential RF current $i_{RF}$ is switched by the MOSFET transistor quad (M8-M11) between the two load resistors $R_L$ [142]. The polarity of the resulting output signal changes at the LO frequency, which results in signal components at $|f_{RF} \pm f_{LO}|$. The high-frequency component is removed by the low-pass filter formed by $R_L$ and the capacitive load at the mixer output, whereas the low-frequency component represents the desired signal to be further processed.
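The polarity switching described above can be illustrated numerically. The sketch below is an idealized model and not the implemented circuit: the RF tone is multiplied by a ±1 square wave at the LO frequency and the amplitude of the difference-frequency term is extracted by correlation. The default cycle counts (28 and 27 per common period) are an assumption chosen to mimic 4.2 GHz and 4.05 GHz relative to a 150 MHz IF; the function name is illustrative.

```python
import math


def ideal_switch_conversion_gain(rf_cycles: int = 28, lo_cycles: int = 27,
                                 samples: int = 100_000) -> float:
    """Amplitude of the difference-frequency term for a unit-amplitude RF tone.

    The RF tone is multiplied by a +/-1 square wave at the LO frequency,
    which models an ideal polarity-switching (commutating) mixer. The
    frequencies are given as integer cycle counts within one common period,
    so that the correlation below is taken over whole periods.
    """
    if_cycles = rf_cycles - lo_cycles
    acc = 0.0
    for i in range(samples):
        t = (i + 0.5) / samples                      # normalized time in [0, 1)
        rf = math.cos(2.0 * math.pi * rf_cycles * t)
        lo = 1.0 if math.cos(2.0 * math.pi * lo_cycles * t) >= 0.0 else -1.0
        # Correlate the mixer output with the IF cosine.
        acc += rf * lo * math.cos(2.0 * math.pi * if_cycles * t)
    return 2.0 * acc / samples


if __name__ == "__main__":
    gain = ideal_switch_conversion_gain()
    print(f"numerical: {gain:.4f}, 2/pi: {2 / math.pi:.4f}, "
          f"in dB: {20 * math.log10(gain):.2f}")
```

The numerical value approaches $2/\pi$, the theoretical limit for ideal square-wave commutation; real switches with finite transition times stay below it.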

### Current Stealing

In this type of mixer, the bias currents required by the transconductance stage and by the LO switches conflict with each other. For the transconductance stage to achieve high linearity and gain, a large gate-source overdrive voltage is required for M1-M2, leading to a large DC bias current. On the other hand, we will later see in Fig. 6.20 that a fairly low current is required for the LO switching quad to obtain optimum performance. A current stealing technique is therefore employed in the design: two PMOS-based DC current sources (M5-M6) are added at the drains of M1-M2 to lower the bias current through the switching quad. The circuit linearity is thus not degraded, since the bias current of the transconductance stage can be fixed independently. An additional advantage is that the load resistance of the mixer can be increased to obtain a higher conversion gain. Current stealing is also known as “charge-injection” [143], “current-steering” or “current-bleeding” [144].

![Simplified schematic of the mixer](image)

**Figure 6.19.** Simplified schematic of the mixer. This device uses an active double-balanced topology and a pseudo-differential input transconductance stage. The common mode regulation is achieved by means of a replica circuit and an OTA.

In the following subsections, the different stages of the mixer are described in detail. Topics such as conversion gain, noise and nonlinearity are also discussed for the considered parts.

**Transconductance Section**

The transconductance stage M1-M2 provides a voltage-to-current conversion with a gain $g_m$, which, together with the load $R_L$, fixes the overall gain of the mixer. High mixer gains are achieved with a high $g_m$, which usually requires higher bias currents. The mixer conversion gain is expressed as

$$A_{v,MIX} = \frac{V_{IFout}}{V_{RFin}} = g_m A_{sw} R_L, \quad (6.54)$$

where $A_{sw}$ is the gain of the switch section (transistors M8-M11), investigated on p. 217.

The proposed mixer uses a pseudo-differential transconductance stage. The advantage of such a topology is twofold. First, the linearity of a pseudo-differential pair is higher and second, this topology saves the saturation voltage required by the tail current source of a true differential pair. The price to pay is a decrease in noise immunity due to a higher common-mode gain. Common-mode noise may convert into differential-mode noise when mismatch occurs in the mixer layout. This conversion can however be minimized by a careful design (large transistors to avoid mismatch) and an improved common-mode feedback mechanism.

The linearity of the pseudo-differential pair is increased owing to the fact that the dominant second-order nonlinearity of a MOSFET circuit cancels out in the differential signal. Assuming a pseudo-differential stage similar to the one formed by M1-M2 in Fig. 6.19, built with MOSFETs satisfying the long-channel I-V characteristic and driven by a signal $V_{RFin} = V_{cm} \pm \frac{\Delta v(t)}{2}$, the differential output current is

$$\Delta i(t) = i_+(t) - i_-(t) = \beta \left[ \left( V_{cm} + \frac{\Delta v(t)}{2} - V_{th} \right)^2 - \left( V_{cm} - \frac{\Delta v(t)}{2} - V_{th} \right)^2 \right] = 2\beta \cdot (V_{cm} - V_{th}) \cdot \Delta v(t), \quad (6.55)$$

where $\beta = \mu_n C_{ox} W/L$ is the MOSFET gain, $V_{th}$ is the threshold voltage and $V_{cm}$ is the common mode voltage at the RF input of the mixer. From the equation above, we can approximate the transconductance $g_m$ of the pseudo-differential pair as

$$g_m = \frac{\Delta i(t)}{\Delta v(t)} \approx 2\mu_n C_{ox} \frac{W}{L} (V_{cm} - V_{th}), \quad (6.56)$$

where ($V_{cm} - V_{th}$) is the overdrive voltage biasing the pseudo-differential pair, which is generated by the diode-connected transistor $M0$. Below velocity saturation, the transconductance can also be expressed as

We observe from Eq. 6.55 that the second-order nonlinearity is ideally removed and the output differential current depends linearly on $\Delta v$. The third-order nonlinearity is caused by large input signal swings exceeding the threshold voltage $V_{th}$, which turns off one of the transistors of the pair.
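The cancellation expressed by Eq. 6.55 can be verified with a few lines of code. The sketch below uses the same long-channel square law and hypothetical device values (the numbers for $\beta$, $V_{th}$ and $V_{cm}$ are assumptions for illustration, not design values). It also shows that the linear relation breaks down once the swing is large enough to turn one transistor off.

```python
def drain_current(v_gs: float, v_th: float, beta: float) -> float:
    """Long-channel square law as used in Eq. 6.55: i = beta * (v_gs - v_th)^2."""
    v_ov = v_gs - v_th
    return beta * v_ov * v_ov if v_ov > 0.0 else 0.0


def differential_current(v_cm: float, dv: float, v_th: float, beta: float) -> float:
    """Differential output current of a pseudo-differential (grounded-source) pair."""
    return (drain_current(v_cm + dv / 2.0, v_th, beta)
            - drain_current(v_cm - dv / 2.0, v_th, beta))


if __name__ == "__main__":
    # Hypothetical device values, for illustration only.
    beta, v_th, v_cm = 5e-3, 0.5, 0.7     # A/V^2, V, V
    for dv in (0.05, 0.1, 0.2, 0.6):
        exact = differential_current(v_cm, dv, v_th, beta)
        linear = 2.0 * beta * (v_cm - v_th) * dv
        print(f"dv = {dv:.2f} V: di = {exact * 1e3:.3f} mA, "
              f"linear model = {linear * 1e3:.3f} mA")
```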

The noise contribution of this stage can be analyzed in a way similar to the input stage of the LNA, given in Section 6.2.4. From a system point of view, the noise contribution of the mixer must remain small once divided by the gain of the preceding LNA, which is larger than 19 dB (Section 6.2.6). Noise figures of less than 15 dB are easily achievable for the mixer and therefore have a negligible contribution in the receiver chain. This can be realized with a fully integrated implementation, where the output impedance of the LNA is close to the optimum source impedance (typically a few hundred ohms) required by the mixer for minimum noise figure.
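The system-level argument above follows the well-known Friis formula for cascaded stages. The sketch below is only indicative: it treats the 19 dB LNA voltage gain as an available power gain, which is a simplification in an integrated, non-50Ω interface, and the LNA noise figure of 4.5 dB is taken from the lower end of the range in Table 6.1. The function name is an assumption.

```python
import math


def cascaded_noise_figure_db(stages):
    """Friis formula for a chain of stages.

    `stages` is a list of (noise_figure_db, available_power_gain_db) tuples,
    ordered from the antenna towards the baseband.
    """
    total_factor = 0.0
    gain_product = 1.0
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = 10.0 ** (nf_db / 10.0)
        if index == 0:
            total_factor = factor
        else:
            total_factor += (factor - 1.0) / gain_product
        gain_product *= 10.0 ** (gain_db / 10.0)
    return 10.0 * math.log10(total_factor)


if __name__ == "__main__":
    # Treating the 19 dB voltage gain as a power gain is a simplification.
    lna = (4.5, 19.0)
    for mixer_nf_db in (10.0, 15.0, 20.0):
        total = cascaded_noise_figure_db([lna, (mixer_nf_db, 0.0)])
        print(f"mixer NF = {mixer_nf_db:4.1f} dB -> cascade NF = {total:.2f} dB")
```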

**LO Switch Section**

The switching stage is mainly responsible for a loss in the overall gain chain of the mixer. This loss depends on the shape and the amplitude of the LO signal and on the overdrive voltage $V_{ov,sw}$ of the switching pair. For an ideal rectangular LO signal with an amplitude larger than $V_{ov,sw}$, the switching pair gain is $A_{sw} = 2/\pi$, i.e., -3.9 dB [143]. For a sinusoidal LO signal, $A_{sw}$ is proportional to the LO amplitude for $V_{LO} < V_{ov,sw}$ and, for LO amplitudes larger than the overdrive voltage, saturates at a value slightly below $2/\pi$ due to the non-square waveform and parasitic losses. The overdrive voltage dictates the requirements for the LO signal amplitude. For a long-channel transistor, the overdrive voltage $V_{ov,sw,long}$ is proportional to the square root of the drain current; therefore, small currents (or a large W/L) are preferred for small LO input signals. Taking the velocity saturation effect $E_{sat}$ into account, the effective overdrive voltage is increased and can be written as:

$$V_{ov,sw} = \sqrt{\frac{I_d}{\mu_n C_{ox} W/L}} \left( \frac{V_{ov,sw,long}}{2LE_{sat}} + \sqrt{1 + \frac{V_{ov,sw,long}^2}{4L^2E_{sat}^2}} \right). \quad (6.57)$$
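The effect of velocity saturation on the overdrive voltage can be explored with the following sketch. It assumes the usual first-order model $I_d = \beta V_{ov}^2 / (1 + V_{ov}/(L E_{sat}))$ with $\beta = \mu_n C_{ox} W/L$, solved for $V_{ov}$; this model choice and all numerical values are assumptions for illustration and are not extracted from the design kit.

```python
import math


def long_channel_overdrive(i_d: float, beta: float) -> float:
    """Square-law overdrive voltage for i_d = beta * v_ov^2."""
    return math.sqrt(i_d / beta)


def overdrive_with_velocity_saturation(i_d: float, beta: float,
                                       length: float, e_sat: float) -> float:
    """Overdrive voltage solving i_d = beta * v^2 / (1 + v / (length * e_sat))."""
    v_long = long_channel_overdrive(i_d, beta)
    x = v_long / (2.0 * length * e_sat)
    return v_long * (x + math.sqrt(1.0 + x * x))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    beta = 20e-3          # A/V^2, stands for mu_n * C_ox * W / L
    length = 180e-9       # m
    e_sat = 4e6           # V/m
    for i_d in (50e-6, 100e-6, 200e-6):
        v_long = long_channel_overdrive(i_d, beta)
        v_sat = overdrive_with_velocity_saturation(i_d, beta, length, e_sat)
        print(f"I_d = {i_d * 1e6:5.0f} uA: long-channel {v_long * 1e3:.1f} mV, "
              f"with velocity saturation {v_sat * 1e3:.1f} mV")
```

For a very large $E_{sat}$ the correction factor tends to one and the long-channel value is recovered.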

Although a larger LO drive can provide a higher $A_{sw}$, an excessively large LO amplitude may degrade the conversion gain. This is mainly due to the generation of even-order harmonics that are coupled into the common source of the differential switch pair, which is connected to the drain of the RF input transistors.

The size of the switching transistors results from a trade-off between the overdrive voltage (a large W/L decreases the overdrive voltage) and the amount of capacitive load that the mixer input presents at the LO buffer. Therefore, for a given bias current of the switching transistor $I_{b,sw}$ and a given LO buffer output impedance, an optimum width exists for the switching transistors.

To investigate this optimum in the $[W_{sw}, I_{b,sw}]$ design space, we assume that the LO buffers have an output impedance of half a kΩ and the LO signal has an amplitude of approximately 500 mV (unloaded). The optimum width $W_{sw}$ and bias $I_{b,sw}$ of the switching transistor have been determined by transient simulations in Fig. 6.20.

![Figure 6.20. Optimization of the LO switch transistors (size $W_{sw}$ and bias $I_{b,sw}$) in terms of current transfer function $A_{sw}$ for minimum length transistor ($L = 180$ nm). The maximum transfer function is identified by a small square marker ($A_{sw,\text{max}} = -4.3$ dB). The chosen parameters ($W_{sw} = 24 \mu m$ and $I_{b,sw} = 0.1$ mA) are identified by a round marker (see text for explanations).](image)

The contours depict the current transfer function $A_{sw}$ of the switching stage. We observe a maximum of -4.3 dB, identified by a black square marker, which is close to the theoretical limit of -3.9 dB. We define a near-optimum area delimited by the -5 dB contour. We also identify two regions where the gain drops drastically. The first, in the upper left corner, corresponds to excessively wide transistors, which result in a larger input capacitance. This region is also characterized by smaller bias currents and hence imposes a large load resistance $R_L$ to maintain the gain and a constant output common mode. Both conditions in the upper left corner of the plot result in an attenuation of the down-converted signal due to a reduced low-pass cutoff frequency at the mixer output. The second area, in the bottom right corner of the plot, corresponds to an excessively large overdrive voltage, which prevents a proper commutation of the switching transistors.

The chosen switch transistor width ($W_{sw} = 24 \, \mu m$) and bias current ($I_{b,sw} = 0.1 \, mA$) are identified by a round marker. It is advantageous to choose wider than needed transistors in order to reduce the mismatch that may result in common-mode to differential-mode conversion at the output of the mixer. On the other hand, halving the bias current (to $I_{b,sw} \approx 0.1 \, mA$) reduces $A_{sw}$ by 1.7 dB but allows the load resistor $R_L$ to be increased by a factor of two ($= 6$ dB). This results in a net increase of the conversion gain $A_{v,mix}$ of $6 - 1.7 = 4.3$ dB without any additional bias current for the transconductance stage. The upper limit for the load resistor value $R_L$ is fixed by the load and parasitic capacitances ($\approx 150 \, fF$), which, placed in parallel with $R_L$ ($\approx 4.8 \, k\Omega$), have to provide an output pole above the baseband signal bandwidth ($\approx 200 \, MHz$).
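The two numerical arguments of this paragraph can be reproduced with a short sketch, assuming a simple first-order RC model for the mixer output. The component values are those quoted in the text; the function name is illustrative.

```python
import math


def rc_pole_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """-3 dB frequency of a first-order RC low-pass, f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


if __name__ == "__main__":
    r_load = 4.8e3        # ohm, value quoted in the text
    c_load = 150e-15      # F, value quoted in the text
    print(f"output pole: {rc_pole_hz(r_load, c_load) / 1e6:.0f} MHz")

    # Net conversion-gain change when the switch bias current is halved.
    gain_from_doubled_load_db = 20.0 * math.log10(2.0)
    switch_loss_db = 1.7
    print(f"net gain change: {gain_from_doubled_load_db - switch_loss_db:.1f} dB")
```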

**Output Common Mode Regulation**

The common-mode regulation circuit is depicted on the right-hand side of Fig. 6.19. It is based on an OTA and a 1:4 replica of the mixer built by transistors M4, M7 and M12. To obtain a better matching of the replica circuit, the gate of the latter transistor is biased by the common-mode voltage of the LO signal, which is approximately 1.2 V. This way, the drain voltages of transistors M1-M2 and their influence on the output DC voltage are also accounted for. The common-mode regulation loop makes use of the transistor $M_7$, the replica of the current stealing transistor, to set the output common mode of the mixer. Therefore, no additional circuitry is needed on the mixer itself.
## 6.4. RF Front-End Characterization

A separate chip including both the LNA and the I/Q-mixers has been implemented for test purposes.

### 6.4.1. Gain, Input Matching and Noise Figure

Measurements of the differential gain $A_v$, input reflection coefficient $S_{11}$ and overall noise figure $NF$ in frequency domain between 2 and 6 GHz of the IR-UWB RF front-end are shown in Fig. 6.21.

![RF front-end measurements in the frequency domain](image)

**Figure 6.21.** RF front-end measurements in the frequency domain. The peak voltage gain reaches 23 dB and the minimum noise figure is 6.5 dB. The input reflection coefficient is better than -8 dB over the entire bandwidth above 3.3 GHz. The vertical dashed lines locate the modulation frequencies $f_{LO} \pm f_o$ for the different channels A, B and C.

The LNA-mixer circuit achieves a maximum voltage gain $A_v$ of 23 dB in channel B and a minimum of 19.5 dB in channel C. The -3 dB corner frequencies (round markers) are located at 3 GHz and 4.7 GHz. In each channel, the gain variation is smaller than $\pm 1$ dB. In this front-end implementation, the peak gain of the LNA has been reduced by approximately 6 dB down to 18 dB, when compared to the LNA realization presented in Section 6.2.6. This eases the fulfilment of the linearity targets (see next subsection). This lower gain has been obtained by decreasing the resistors $R_L$ at the LNA output (Fig. 6.2), which also contributes to flattening the gain and makes the overall gain characteristic less sensitive to process variations over the desired bandwidth. The simulated gain of the mixer (with the output buffer) is close to 5 dB. The front-end also shows a notch between 2 GHz and 2.45 GHz. The gain in this notch is more than 10 dB below the front-end peak gain and does not exhibit any dependence on the supply voltage. An overall noise figure between 6.4 dB (channel B) and 8 dB (channel C) is achieved. This is 1.5 dB higher than the noise figure of the LNA alone, which can be explained by the lowered LNA gain and the additional noise introduced by the mixer and the output buffer implemented for measurement purposes. The gain and noise figure results have been obtained for an output frequency of $f_{IF} = f_o = 150$ MHz, an LO amplitude of -2 dBm ($V_{LO} \approx 250$ mV) and a capacitive load at the IF output of approximately 2 pF (high-frequency probe of a digital sampling scope).
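The correspondence between the LO power and the LO voltage amplitude quoted above can be checked as follows. The sketch assumes a sinusoidal signal and a 50Ω reference impedance, which is an assumption about the measurement setup; the function name is illustrative.

```python
import math


def dbm_to_peak_volts(power_dbm: float, impedance_ohm: float = 50.0) -> float:
    """Peak voltage of a sine wave delivering `power_dbm` into `impedance_ohm`."""
    power_w = 1e-3 * 10.0 ** (power_dbm / 10.0)
    v_rms = math.sqrt(power_w * impedance_ohm)
    return math.sqrt(2.0) * v_rms


if __name__ == "__main__":
    print(f"-2 dBm into 50 ohm: {dbm_to_peak_volts(-2.0) * 1e3:.0f} mV peak")
```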

In the same Fig. 6.21, a plot of the measured differential input S-parameter $S_{11,d}$ is given. This result is very close to the one obtained in Fig. 6.15, since the integrated LNA is biased with the same current and features an identical input topology. The RF front-end draws 8 mA ($I_{LNA} = 5$ mA, $I_{mixers} = 3$ mA).

### 6.4.2. Linearity

**Intermodulation IIP2 and IIP3**

The inband linearity of the front-end has been evaluated with a two-tone excitation. In a front-end featuring down-conversion, this type of signal enables the identification of intermodulation products of second and third order, which, with the correct choice of frequency value for the tones, can be located within the receiver bandwidth. These inband components play a dominant role in bandpass systems, as they constitute the main source of nonlinearity. As seen in Section 3.11, our polynomial model of Eq. 3.87 is excited by a two-tone signal

$$x(t) = A_1 \cos (2\pi f_1 t) + A_2 \cos (2\pi f_2 t). \quad (6.58)$$

For our test and in order to obtain worst-case results, we chose the two-tone frequencies in channel B ($f_{LO} = 4.05$ GHz), i.e. where the LNA gain is at its maximum. By choosing $f_1 = 4.2$ GHz and $f_2 = 4.17$ GHz, we obtain inband second- and third-order intermodulation products located at $|(f_1 - f_{LO}) - (f_2 - f_{LO})| = 30$ MHz and $|2 \cdot (f_2 - f_{LO}) - (f_1 - f_{LO})| = 90$ MHz, respectively. The amplitudes of these components are reported in Fig. 6.22 as a function of the amplitude of the two-tone components ($A_1 = A_2$, expressed by their equivalent power $P_{in}$ in dBm and given on the x-axis). By extrapolating the 1-dB/dB, the 2-dB/dB and the 3-dB/dB slope lines of the fundamental, the second- and the third-order output power, respectively, we obtain the second- and third-order input intercept points $IIP_2$ and $IIP_3$. The extracted values for $IIP_2$ and $IIP_3$ are $+19$ dBm and $-18$ dBm, respectively. The 1-dB input compression point is $-33$ dBm.
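The location of the down-converted intermodulation products follows from simple frequency arithmetic, sketched below for the tone and LO frequencies given in the text. The function name and the dictionary layout are illustrative choices.

```python
def baseband_intermod_mhz(f1_mhz: float, f2_mhz: float, f_lo_mhz: float) -> dict:
    """Down-converted two-tone frequencies and their low-order intermodulation products."""
    b1 = abs(f1_mhz - f_lo_mhz)   # first tone after down-conversion
    b2 = abs(f2_mhz - f_lo_mhz)   # second tone after down-conversion
    return {
        "tones": (b1, b2),
        "im2": abs(b1 - b2),
        "im3": (abs(2 * b2 - b1), abs(2 * b1 - b2)),
    }


if __name__ == "__main__":
    print(baseband_intermod_mhz(4200.0, 4170.0, 4050.0))
```

Of the two third-order products, only the lower one lies within the upper cut-off frequency of the baseband discussed for the VGA.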

\[ \begin{align*}
CP_{1dB} &= -33 \text{ dBm} \\
IIP_2 &= +19 \text{ dBm} \\
IIP_3 &= -18 \text{ dBm}
\end{align*} \]

**Figure 6.22.** RF front-end linearity measurements. The RF front-end fits the requirements formulated in Section 3.11. At higher input power ($P_{in} > -35$ dBm), an increase of the slope of the IMP can be observed. This suggests a higher-order contribution to the overall nonlinearity of the front-end.

**Harmonic Distortions HD2**

We have seen in Section 3.11 that the second harmonic distortion of a narrowband interferer occurring in the ISM2.4 band is one of the most threatening signals for our application. This interferer comes from a single device and therefore does not result from two signals, as is the case for intermodulation products. When combined with other interferers to build intermodulation products, the ISM2.4 signal is also involved and is responsible for the strongest IMP (see the four rightmost interferers in Fig. 3.38).

In this section, the input-referred second-order harmonic distortion $P_{HD2}$ is measured and compared with simulations. Assuming a continuous wave located at 2.4 GHz with a power $P_{ISM2.4} \approx -30$ dBm impinging at the LNA input (which is approximately the expected power of a WLAN device emitting at +20 dBm at a distance of 10 cm), the down-converted output signal at 150 MHz has been measured at approximately $V_{IF} = 950 \mu V$. Taking into account the 20 dB voltage gain of the front-end, this corresponds to an equivalent input-referred power at 4.8 GHz of -76 dBm. The worst-case simulation over 20 Monte-Carlo runs is -71 dBm, whereas the average is around -92 dBm ($3 \sigma \approx 21$ dB). The measured value of -76 dBm is thus within the expected range, but 7 dB above the maximal value of -83 dBm defined in Fig. 3.38. This however represents a worst-case condition; additional out-of-band attenuation can be obtained with a more selective antenna (an equivalent quality factor of 2 has been used for the antenna in simulations) or by simply reducing the LNA gain at the cost of a slight increase in noise figure. Simulations show that the same circuit under the same conditions but without the notch filter exhibits an input-referred harmonic power $P_{HD2} = -61$ dBm, which is 15 dB higher than with the notch-filtered LNA. This is less than twice the additional attenuation brought by the notch filter (approximately 10 dB, see Fig. 6.14), since the mixer itself also contributes to the generation of a second-order harmonic.

### 6.4.3. Test Chip

A photomicrograph of the chip is shown in Fig. 6.23. The test chip is pad limited, but the active part of the circuit occupies only 0.5 × 0.7 mm$^2$. The three pairs of inductors belong (from left to right) to the input matching, the degeneration and the LNA output, respectively. The inductor in the center is part of the ISM2.4G notch filter. The mixers are inductorless. Output buffers have been added so that the test chip is able to drive the capacitive input of an oscilloscope.

**Figure 6.23.** RF front-end chip micrograph.

## 6.5. The Variable Gain Amplifier (VGA)

### 6.5.1. Specifications

The proposed VGA must be able to compensate for signal amplitude variations caused by receiver-transmitter distances varying between 3 cm and 10 m. Assuming a free space environment with a propagation exponent of two, the received signal level varies over a range of $10 \log_{10} [(10/0.03)^2] \approx 50$ dB. At 10 m, the typical peak signal amplitude $V_{rf}$ at the receiver front-end is about one hundred microvolts. Thus, to obtain a signal in the order of hundreds of millivolts at the analog baseband output ($V_{bb}$), the maximum gain of the entire receiver chain must be approximately 70 dB. With an RF front-end gain of 20 dB (Section 6.4), a maximum VGA gain of 60 dB provides some margin for additional variations caused by multipath reflections. The VGA must accommodate down-converted bandpass quadrature IR-UWB signals centered around $f_0 = 150$ MHz and having a -10 dB bandwidth extending up to 250 MHz ($f_{-3dB} \approx 203$ MHz). Simulations show that an upper -3 dB cut-off frequency of 180 MHz is still sufficient to not impair the receiver sensitivity (see Fig. 3.30). Moreover, since the spectral content of the received pulse is negligible up to several MHz, the lower -3 dB cut-off frequency is specified to a maximum of 10 MHz. This enables the use of decoupling capacitors for the interstage interconnect and avoids the implementation of a DC offset cancelation loop. The circuit also has to feature an automatic gain control (AGC) loop with a signal level detector to provide a constant output signal amplitude of about 100-200 mV, regardless of variations of the input signal. This way, the VGA-AGC pair is autonomous and does not require any control from the baseband to provide the desired signal amplitude.
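The gain budget above can be reproduced with a short sketch. The distances and the 100 µV input amplitude are those quoted in the text; the 300 mV output target is an assumed value standing for "hundreds of millivolts", and the function names are illustrative.

```python
import math


def free_space_range_db(d_max_m: float, d_min_m: float, exponent: float = 2.0) -> float:
    """Received-power variation between two distances for a given path-loss exponent."""
    return 10.0 * exponent * math.log10(d_max_m / d_min_m)


def voltage_gain_db(v_out: float, v_in: float) -> float:
    """Voltage gain in dB needed to bring v_in up to v_out."""
    return 20.0 * math.log10(v_out / v_in)


if __name__ == "__main__":
    print(f"dynamic range: {free_space_range_db(10.0, 0.03):.1f} dB")
    # 100 uV at the front-end is quoted in the text; the 300 mV target is an
    # assumed value standing for 'hundreds of millivolts'.
    print(f"required chain gain: {voltage_gain_db(0.3, 100e-6):.1f} dB")
```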

### 6.5.2. Required Gain-Bandwidth Product

A rough estimation of the required gain-bandwidth product is given by the maximum gain of 1000 (60 dB) that has to be reached over a bandwidth of 200 MHz. Therefore we have

\[
GBW = 20 \log(1000 \cdot 200) = 106 \text{ dB} \cdot \text{MHz.} \tag{6.59}
\]

Since this value is larger than the transit frequency of a transistor \((f_t \approx 55 \text{ GHz} \approx 95 \text{ dB} \cdot \text{MHz})\), a multi-stage topology is required. It can be shown that the gain-bandwidth product of a cascade of \(n\) identical first-order stages is [145,146]

\[
GBW_n = A_{v,DC}^{n} \cdot \frac{GBW_c}{A_{v,DC}} \cdot \sqrt{2^{1/n} - 1} = GBW_c \cdot A_{v,DC}^{n-1} \cdot \sqrt{2^{1/n} - 1}, \tag{6.60}
\]

where \(n\) is the number of stages; \(GBW_c\) and \(A_{v,DC}\) are the gain bandwidth product and the DC gain of a single cell, respectively. We now determine \(GBW_c\), the gain-bandwidth product of a single cell placed in a chain of identical stages. The load effect of the following stage, as well as the gate-drain capacitance between the input and the
output are taken into account. For this purpose, we use the Miller approximation. The transfer function of the equivalent circuit depicted in Fig. 6.24 can be expressed as

$$A_v(\omega) = \frac{v_o}{v_i} = g_m R_o \frac{1 - j\omega MC_{gd}/g_m}{1 + j\omega R_o(C_{gs} + MC_{gd})}, \tag{6.61}$$

where $M$ accounts for the Miller effect; $M \approx 1 + g_m R_o$ for a common-source stage and $M \approx 1 + g_m/g_{m,casc} \approx 2$ for a cascode topology. By neglecting the zero introduced by the Miller effect, which occurs only at higher frequencies, we obtain a good approximation for the dominant pole

$$\omega_{p,\text{dom}} \approx \frac{1}{R_o(C_{gs} + MC_{gd})}. \tag{6.62}$$

Since the DC gain is $g_m R_o$, we can write the gain-bandwidth product of a cell placed in the amplification chain as

$$GBW_c = \frac{1}{f_T^{-1} + f_M^{-1}}, \tag{6.63}$$

where

$$f_T = \frac{g_m}{2\pi(C_{gs} + C_{gd})}, \tag{6.64}$$

$$f_M = \begin{cases} \frac{1}{2\pi R_o C_{gd}} & \text{common source,} \\ \frac{g_m}{2\pi C_{gd}} & \text{cascode.} \end{cases} \tag{6.65}$$

The term $f_M$ in Eq. 6.63 describes the influence of the gate-drain capacitance in conjunction with the gain (Miller effect) on $GBW_c$ and must be kept high. $f_T$ depends on the technology, whereas $f_M$ reflects the chosen topology. $R_o$ represents the load resistance seen at the output of the transconductance stage. Unfortunately, for a common-source topology, $R_o$ cannot be made small since the DC gain also depends on it. A way to minimize this degradation is to use a cascode topology, which decouples the gain from the gain-bandwidth product.
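
As a rough numerical illustration of Eqs. 6.63 to 6.65, the sketch below evaluates $GBW_c$ for both topologies and cross-checks the result against the product of DC gain and dominant pole (Eq. 6.62). The device values ($g_m$, $C_{gs}$, $C_{gd}$, $R_o$) are hypothetical and do not come from the design described here.

```python
import math

def gbw_cell(g_m, c_gs, c_gd, r_o, topology):
    """Gain-bandwidth product of one cell, Eqs. 6.63 to 6.65 (in Hz)."""
    f_t = g_m / (2.0 * math.pi * (c_gs + c_gd))          # Eq. 6.64
    if topology == "common_source":
        f_m = 1.0 / (2.0 * math.pi * r_o * c_gd)         # Eq. 6.65, first case
    elif topology == "cascode":
        f_m = g_m / (2.0 * math.pi * c_gd)               # Eq. 6.65, second case
    else:
        raise ValueError("unknown topology: " + topology)
    return 1.0 / (1.0 / f_t + 1.0 / f_m)                 # Eq. 6.63

def gbw_direct(g_m, c_gs, c_gd, r_o, miller):
    """DC gain g_m*R_o multiplied by the dominant pole of Eq. 6.62 (in Hz)."""
    pole_hz = 1.0 / (2.0 * math.pi * r_o * (c_gs + miller * c_gd))
    return g_m * r_o * pole_hz

# Hypothetical device values, chosen only for illustration.
g_m, c_gs, c_gd, r_o = 5e-3, 50e-15, 15e-15, 1e3

gbw_cs = gbw_cell(g_m, c_gs, c_gd, r_o, "common_source")
gbw_casc = gbw_cell(g_m, c_gs, c_gd, r_o, "cascode")
print(f"common source: {gbw_cs / 1e9:.2f} GHz")
print(f"cascode      : {gbw_casc / 1e9:.2f} GHz")
```

For these example values the cascode cell keeps a noticeably larger $GBW_c$ than the common-source cell, because its Miller factor stays near 2 instead of growing with the DC gain.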

6.5.3. Design Guideline

The choice of the number of stages is actually not critical for the power consumption. An increase in the number of stages allows a reduction of $GBW_c$ and, consequently, a reduction of the cell’s bias current (through $f_T$); however, because more stages are needed, this reduction does not significantly lower the current of the entire VGA. On the other hand, increasing the number of stages also increases circuit complexity and silicon area. Hence, for our application, the minimum number of stages can be determined from Eq. 6.60 and Eq. 6.63 and lies around $n = 4$.
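
The trade-off between the number of stages and the per-cell requirement can be tabulated from the second form of Eq. 6.60. The sketch below is a minimal example that assumes the overall targets quoted earlier (gain of 1000 over 200 MHz) and identical first-order cells; the function name `required_cell` is introduced here for illustration only.

```python
import math

def required_cell(total_gain, bandwidth_hz, n):
    """Per-cell DC gain, bandwidth and GBW for n identical first-order cells."""
    shrink = math.sqrt(2.0 ** (1.0 / n) - 1.0)   # bandwidth shrinkage factor of Eq. 6.60
    cell_gain = total_gain ** (1.0 / n)          # each cell provides the n-th root of the gain
    cell_bw = bandwidth_hz / shrink              # each cell must be wider than the chain
    return cell_gain, cell_bw, cell_gain * cell_bw

for n in range(1, 7):
    gain, bw, gbw = required_cell(1000.0, 200e6, n)
    print(f"n={n}: cell gain {gain:6.2f}, cell BW {bw / 1e6:7.1f} MHz, "
          f"GBW_c {gbw / 1e9:7.2f} GHz")
```

The required $GBW_c$ drops steeply for the first few stages and then flattens, which is in line with the statement that adding stages beyond a small number brings little additional benefit.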

### 6.5.4. Automatic Gain Control (AGC) Loop

**Block Schematic**

The baseband amplifier chain features an automatic gain control (AGC) loop, whose function is to maintain a constant signal level at the output, regardless of the signal’s variations at the input of the VGA.

![Figure 6.25](image)

**Figure 6.25.** Typical VGA transfer function with AGC loop. The x-axis ($V_i$) represents the input signal amplitude, whereas the y-axis reports the output amplitude $V_o$ of the VGA. The operational range of the AGC-VGA pair lies between $V_{\text{min}}$ and $V_{\text{max}}$. Within this range, the output is ideally constant ($V_o = V_{o,\text{ss}}$) and independent of the input signal amplitude (see text for detailed explanations).

Figure 6.25 shows the typical transfer function between the input and output signal amplitudes ($V_i, V_o$) of a VGA featuring an AGC loop. For input signals that are too low ($V_i < V_{\text{min}}$), the AGC leaves its operational range and the VGA gain is set to its maximum $G_{\text{max}}$; hence, the output $V_o$ is a linear function of $V_i$. The range between $V_{\text{min}}$ and $V_{\text{max}}$ corresponds to the operational range of the loop, where a stable output signal amplitude $V_o$ is provided regardless of the input amplitude $V_i$. Above a certain input level ($V_i > V_{\text{max}}$), the AGC may cease to operate correctly and drive the VGA output into saturation ($G = G_{\text{min}}$).

The equivalent schematic of the AGC is depicted in Fig. 6.26. Its function consists in monitoring the output of the VGA with a level detector, whose output $V_d$ is compared with a reference value $V_{\text{ref}}$. The result of the comparison is fed into a loop filter $F(s)$, which has to provide an adequate and stable control voltage $V_c$ to the VGA.

The AGC loop equations can be written as

\[
V_c = F(s) \cdot (V_{\text{ref}} - V_d) = F(s) \cdot (V_{\text{ref}} - D(V_o)), \tag{6.66}
\]

\[
V_o = V_i \cdot G(V_c), \tag{6.67}
\]

where $V_c$ is the control voltage of the VGA, $V_{\text{ref}}$ is the reference voltage, $F(s)$ is the loop filter and $V_d = D(V_o)$ represents the output of the amplitude detector.

**AGC Loop Equation**

In this section, we investigate the dynamic behavior of the AGC loop within its operational range ($V_{\text{min}} < V_i < V_{\text{max}}$, see Fig. 6.25) for an arbitrary detector function $D(V_o)$. The goal is to describe how the loop reacts to small variations of the input signal around a particular operating point and what the requirements are for the different blocks building the AGC loop illustrated in Fig. 6.26. The small-variation assumption is considered valid for UWB signals since these experience much less fading than narrow-band signals (see Chap. 3). We first describe the small-signal output amplitude variations with respect to a small change of the input amplitude:

\[
\frac{\partial V_o}{\partial V_i} = \frac{\partial [G(V_c) \cdot V_i]}{\partial V_i} = V_i \cdot \frac{\partial G(V_c)}{\partial V_i} + G(V_c). \quad (6.68)
\]

The derivative of the VGA gain with respect to the input amplitude can be decomposed into

\[
\frac{\partial G(V_c)}{\partial V_i} = \frac{\partial G(V_c)}{\partial V_c} \frac{\partial V_c}{\partial V_d} \frac{\partial V_d}{\partial V_o} \frac{\partial V_o}{\partial V_i} = G' \cdot F(s) \cdot D' \cdot \frac{\partial V_o}{\partial V_i}. \quad (6.69)
\]

By inserting Eq. 6.69 into Eq. 6.68, we obtain

\[
\frac{\partial V_o}{\partial V_i} (1 + V_i \cdot G' \cdot F(s) \cdot D') = \frac{V_o}{V_i}, \quad (6.70)
\]

and finally, the behavior of the AGC loop for small changes from a particular operating point can be written as

\[
\frac{\partial V_o/V_o}{\partial V_i/V_i} = \frac{1}{1 + V_i \cdot G' \cdot F(s) \cdot D'}. \quad (6.71)
\]

We observe from Eq. 6.71 that the loop behavior depends on the input amplitude \( V_i \). Since \( V_i \) may vary over more than three decades, the loop dynamics within the operational range may change in an unacceptable way. To suppress this dependence on the input signal amplitude \( V_i \), an exponential VGA gain law with respect to the control voltage \( V_c \) is desired, that is \( G(V_c) = a^{bV_c} \). The advantage of using an exponential gain function is that the derivative of the gain \( G' \) itself contains the expression of the gain \( G = V_o/V_i \), i.e.

\[
G' = b \ln (a) \cdot a^{bV_c} = b \ln (a) \cdot G = b \ln (a) \cdot \frac{V_o}{V_i}. \quad (6.72)
\]

Thus, by inserting the latter equation into Eq. 6.71, the denominator becomes dependent only on the output voltage \( V_o \). Since the latter can be considered constant over the operational range of the AGC, that is, between \( V_{\text{min}} \) and \( V_{\text{max}} \) (Fig. 6.25), the loop dynamics become more controllable. Equation 6.71 is rewritten as

\[
\frac{\partial V_o/V_o}{\partial V_i/V_i} = \frac{1}{1 + b \ln (a)V_o \cdot F(s) \cdot D'}. \quad (6.73)
\]

The expression above shows a dependence on the output amplitude \( V_o \). To make the loop fully independent of the output amplitude as well, classical AGCs use a logarithmic detector function \( D(V_o) = K_D \ln(V_o) \). The derivative of \( D \) now becomes \( D' = K_D/V_o \), which leads to

\[
\frac{\partial V_o/V_o}{\partial V_i/V_i} = \frac{1}{1 + b \ln (a)K_D F(s)}. \tag{6.74}
\]

This particular case represents the classical “linear-in-decibel” AGC loop topology, where the loop dynamic, defined by Eq. 6.74, is solely determined by design parameters such as the gain constants of the VGA \((a \text{ and } b)\), the detector gain \(K_D\) and the loop filter \(F(s)\).

In order to allow a simpler AGC design without logarithmic detection, we now study the behavior of a loop featuring an exponential-gain VGA but using an arbitrary detector function \(D(V_o)\). First, the steady-state variation at the output, in response to a variation in the input, can be written as

\[
\frac{\partial V_o}{V_o} = \frac{\partial V_i/V_i}{1 + b \ln (a)F(0)D'V_o}, \tag{6.75}
\]

where \(F(0)\) is the DC gain of the loop filter. Thus, for any variation of the input amplitude \(\partial V_i/V_i\), it is desirable to keep the change \(\partial V_o/V_o\) as small as possible. A way to obtain such a behavior is to make the DC gain of the loop filter as large as possible. One method is to use an integrator as the filter \(F(s)\), i.e. \(F(s) = C/s\). This choice brings a second advantage: the steady-state error no longer depends on the output voltage \(V_o\). Therefore, any kind of detector (even those not removing \(V_o\) from the denominator of Eq. 6.75) can be used. Furthermore, with an integrator loop filter \(F(s) = C/s\), Eq. 6.75, with \(F(0)\) replaced by \(F(s)\), becomes a first-order high-pass function, so that \(\partial V_o/V_o \to 0\) when \(t \to \infty\) for any variation of the input signal \(\partial V_i/V_i\). This function has a zero at the origin and a pole at

\[
\omega_p = Cb \ln (a)D'V_{o,ss}, \tag{6.76}
\]

where \(V_{o,ss}\) is the desired steady-state output amplitude \((V_o = V_{o,ss}\) in the operational range of the AGC loop, see Fig. 6.25). This first-order transfer function also ensures a stable loop behavior. Stability analysis of AGC circuits has already been extensively investigated in [147].
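
The small-signal result above can be checked with a short time-domain simulation. The sketch below is a minimal behavioral model under stated assumptions: an exponential gain law $G = a^{bV_c}$, an integrator loop filter $F(s) = C/s$, and, purely as an example of an arbitrary detector, a square law $D(V_o) = K_D V_o^2$. All numerical values ($a$, $b$, $K_D$, $C$, $V_{o,ss}$, the 5 % input step) and the function names are illustrative and not taken from the implemented circuit.

```python
import math

def predicted_pole(a, b, k_d, c_int, v_o_ss):
    """Pole of Eq. 6.76 for the example detector D(Vo) = k_d * Vo**2."""
    d_prime = 2.0 * k_d * v_o_ss                  # D' evaluated at Vo = Vo,ss
    return c_int * b * math.log(a) * d_prime * v_o_ss

def settling_time(v_i0, step=1.05, a=10.0, b=1.0, k_d=0.2,
                  c_int=1.0e6, v_o_ss=0.15, dt=1.0e-7, t_end=5.0e-4):
    """Time for ln(Vo/Vo,ss) to decay to 1/e of its initial value after an input step."""
    v_ref = k_d * v_o_ss ** 2                     # reference equals the detector output at Vo,ss
    v_c = math.log(v_o_ss / v_i0) / (b * math.log(a))   # control voltage giving Vo = Vo,ss
    v_i = v_i0 * step                             # small step of the input amplitude
    dev0 = math.log(step)                         # initial deviation of ln(Vo)
    t = 0.0
    while t < t_end:
        v_o = v_i * a ** (b * v_c)                # exponential gain law
        if math.log(v_o / v_o_ss) <= dev0 / math.e:
            return t
        v_c += dt * c_int * (v_ref - k_d * v_o ** 2)    # forward-Euler integrator, F(s) = C/s
        t += dt
    return float("nan")

omega_p = predicted_pole(10.0, 1.0, 0.2, 1.0e6, 0.15)
for v_i0 in (1.0e-3, 1.0e-1):
    t_e = settling_time(v_i0)
    print(f"Vi = {v_i0:7.4f} V: 1/e time = {t_e * 1e6:5.1f} us, "
          f"1/omega_p = {1e6 / omega_p:5.1f} us")
```

In this model the 1/e settling time stays close to $1/\omega_p$ and is the same for input amplitudes two decades apart, which illustrates how the exponential gain law removes the dependence on $V_i$.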

Detector and Loop Filter: Implementation

In this section, we investigate the influence of a very simple detector based on the nonlinear \(I_{ds}(V_{gs})\) curve of a MOS transistor. This detection circuit behaves like a half-wave rectifier and is depicted in Fig. 6.27 (“HWR” block and gray-shaded area). This realization can be made even more practical with the use of zero-Vt transistors provided by the BiCMOS7WL technology. Zero-Vt transistors have a reduced offset (small threshold voltage $V_{th}$) between output and input for the rectifying function and hence help to improve the dynamic range of the detection.

**Signal detection principle:** The VGA output $V_{cm} + V_{o \pm}$ is applied to the gate of a source-follower NFET $M_d$, whose source is connected to a capacitor $C_d$. When a positive voltage occurs, the transistor conducts and charges the capacitor $C_d$. When the gate voltage drops back below the source voltage, the transistor returns to the blocking state. The current source $I_b$ slowly discharges the capacitor to recover the initial state in the absence of signal. The current source thus also defines a decay time or can be used as a reset device for the capacitor (not used in this design). Furthermore, $I_b$ serves as a bias device for $M_d$ such that this transistor has a defined mode of operation. This feature is also used in the setting of the internal reference $V_{ref}$ described in the next paragraph. Owing to the quadratic law between the gate-source voltage and the drain current of $M_d$, the detector law can be modeled by a square-law function, i.e. $D(V_o) \approx K_D \cdot V_o^2$.

![Figure 6.27. Detection principle used in the AGC loop and time domain plot of the VGA control signal $V_c$ for incoming baseband pulses.](image-url)

The typical transfer function of the HWR block is given in Fig. 6.28 for an input pulse amplitude varying over three decades from 1 mV to 1 V. The equivalent transfer function can be well approximated by a square-law function with a multiplying factor $K_D \approx 0.2$.

**Reference voltage generation:** The internal reference voltage $V_{ref}$ is indirectly generated by a current source $I_{ref}$, which can be externally adjusted and whose current is passed through a resistance $R_{ref}$. The resulting signal $V_{cm} + V'_{ref}$, where $V'_{ref} = I_{ref} \cdot R_{ref}$, is passed through a replica of the rectifying block HWR. This provides a way to remove the effect of the unknown threshold voltage of the transistor $M_d$. The error voltage $V_{err}$ is generated from the difference between $(V_{d+} + V_{d-})$ and $V_{ref}$. $V_{err}$ then feeds the loop filter $F(s) = -\frac{g_m}{sC_{ext}}$ to generate the control voltage $V_c$.

**AGC Loop Behavior with Squarer Detector**

Using a square-law detector, the pole of the AGC loop transfer function within the operating range can be well approximated by

$$\omega_{p,squ} \approx \frac{g_m}{C_{ext}} b \ln (a) 4K_D V_{o,ss}^2, \tag{6.77}$$

where $V_{o,ss}$ is the desired steady-state output amplitude. It is interesting to understand what happens when the AGC loop is not close to its steady state. In the absence of signal, the output $V_o$ consists only of noise and is smaller than the targeted value of the operational range; the pole frequency consequently decreases and the loop behaves like an integrator. Above $V_{max}$, the output increases and the pole frequency becomes larger. This leads to a larger loop bandwidth and consequently a faster reaction of the loop. This property is interesting for a stand-alone AGC without a gating function that holds the VGA gain during the absence of signal (as used, e.g., in TV receivers). The large $V_o$ increases the loop bandwidth and thus brings the AGC loop faster to a value close to the steady state. This can be observed in the simulations illustrated in Fig. 6.29, where the 20 dB step response of the AGC loop with a squarer detector (continuous curve) is compared with that of a classical logarithmic detection, which is also described by a first-order high-pass transfer function (dashed curve) with a pole defined by Eq. 6.77. It can be observed that right after the input step, between $t = 0$ and $35 \mu s$, the loop reacts much faster (larger bandwidth). Once the output signal has dropped, the response is characterized by a slower time constant.

\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{figure6.29}
\caption{Step response of an AGC loop with square-law detection. The first-order approximation based on the pole given in Eq. 6.77 (dotted line) fits the simulations well (continuous line).}
\end{figure}
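
The asymmetric large-signal behavior described for the square-law detector can be reproduced with a normalized model. The sketch below assumes an ideal exponential-gain VGA and an integrator loop filter, and writes the loop in terms of $u = \ln(V_o/V_{o,ss})$ and the normalized time $\tau = \omega_p t$. Under these assumptions the square-law loop obeys $du/d\tau = -\tfrac{1}{2}(e^{2u} - 1)$ and the logarithmic-detector loop obeys $du/d\tau = -u$; both share the same small-signal pole. The 20 dB step and the 3 dB residual are illustrative choices.

```python
import math

def square_law_rate(u):
    """Normalized loop dynamics with a square-law detector."""
    return -0.5 * (math.exp(2.0 * u) - 1.0)

def log_detector_rate(u):
    """Normalized loop dynamics with a logarithmic detector (first order)."""
    return -u

def time_to_reach(u_start, u_stop, rate, d_tau=1.0e-4, tau_max=50.0):
    """Normalized time until |u| has decayed to |u_stop| (forward Euler)."""
    u, tau = u_start, 0.0
    while tau < tau_max:
        if abs(u) <= abs(u_stop):
            return tau
        u += d_tau * rate(u)
        tau += d_tau
    return float("nan")

def db_to_ln(level_db):
    """Convert an amplitude deviation in dB to natural-log units."""
    return level_db * math.log(10.0) / 20.0

u_step, u_residual = db_to_ln(20.0), db_to_ln(3.0)

up_square = time_to_reach(+u_step, u_residual, square_law_rate)
up_log = time_to_reach(+u_step, u_residual, log_detector_rate)
down_square = time_to_reach(-u_step, u_residual, square_law_rate)
down_log = time_to_reach(-u_step, u_residual, log_detector_rate)

print(f"+20 dB step: square law {up_square:.2f}, log detector {up_log:.2f}")
print(f"-20 dB step: square law {down_square:.2f}, log detector {down_log:.2f}")
```

In this idealized model the square-law loop recovers from a positive step considerably faster than the first-order reference, and more slowly from a negative step, where it approaches a constant slew rate.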
We thus demonstrated that, assuming an exponential gain VGA that removes the input amplitude dependence, an AGC using a simple square-law device shows dynamic behavior that can be employed for the baseband section of the UWB receivers. Within the operational range, the transfer function of the output amplitude vs. the input amplitude can be well approximated by a first-order high-pass filter transfer function, whose pole is determined by the steady-state output $V_{o,ss}$ fixed by the reference voltage $V_{ref}$, the external capacitance $C_{ext}$, the OTA’s $g_m$ and parameters $a$ and $b$ of the exponential gain VGA device.

### 6.5.5. Amplification Cell

In Section 6.5.4, we showed the importance of the exponential gain characteristic to remove the dependence of the loop dynamics on the input amplitude. Usually, a complex circuit is implemented to obtain an exponential relationship between gain and control voltage. In CMOS technology, a simple solution exists and has been proposed in [148]. It consists in bringing the input transistors of a differential pair into triode mode by reducing their drain-source voltage. The starting point for the derivation is illustrated in Fig. 6.30. This figure depicts the equivalent half-circuit of a cascode differential pair, where the input common-source transistor $M_1$ is set in triode by applying the appropriate voltage $V_c$ at the gate of the cascode transistor $M_2$. Using Kirchhoff’s voltage law (KVL), we obtain

$$V_c - V_{cm} + V_{gs1} - V_{gs2} - V_{ds1} = 0. \tag{6.78}$$

![Figure 6.30. Basic schematic showing the principle of the exponential gain law. This figure represents the equivalent half-circuit of a cascode transistor pair.](image-url)

We replace $V_{gs1}$ and $V_{gs2}$ by the classical equations of the triode and saturation regions, respectively, that is

\[
V_{gs1} = \frac{I_{ds}}{\beta V_{ds1}} + \frac{V_{ds1}}{2} + V_{th1}, \quad (6.79)
\]

\[
V_{gs2} = \sqrt{\frac{2I_{ds}}{\beta}} + V_{th2}, \quad (6.80)
\]

where $\beta = \mu_n C_{ox} W/L$ is the transistor gain factor and $V_{th[1,2]}$ are the threshold voltages. The goal is to express the transconductance $g_{m1}$ as a function of the control voltage $V_c$. By inserting $V_{gs[1,2]}$ into Eq. 6.78, we obtain, after a few calculations, an expression of the drain-source voltage $V_{ds1}$ as a function of $V_x(V_c)$

\[
V_{ds1} = \sqrt{\frac{I_{ds}}{\beta}} \left( V_x + \sqrt{V_x^2 + 1} \right), \quad (6.81)
\]

where $V_x$ is the normalized control voltage and can be written as

\[
V_x(V_c) = \frac{(V_c - V_{th2}) - (V_{cm} - V_{th2})}{2\sqrt{I_{ds}/\beta}} - \frac{1}{\sqrt{2}}. \quad (6.82)
\]

Then, by expressing the transconductance of a transistor in triode

\[
g_{m1} = \frac{\partial I_{ds}}{\partial V_{gs}} = \frac{\partial}{\partial V_{gs}} \left\{ \beta \left[ \left( V_{gs} - V_{th} \right) - \frac{V_{ds}}{2} \right] V_{ds} \right\} = \beta V_{ds1}, \quad (6.83)
\]

we can rewrite Eq. 6.81 as

\[
g_{m1} = \sqrt{\beta I_{ds}} \left( V_x + \sqrt{V_x^2 + 1} \right). \quad (6.84)
\]

The expression $V_x + \sqrt{V_x^2 + 1}$ of Eq. 6.84 can be expanded in a Taylor series and exhibits a quasi-exponential behavior, that is

\[
V_x + \sqrt{V_x^2 + 1} = 1 + V_x + \frac{1}{2} V_x^2 - \ldots \approx e^{V_x}. \quad (6.85)
\]

The resulting transconductance $g_{m1}$ can be approximated as

\[
g_{m1} \approx K_G \cdot \exp \left[ \frac{V_c}{2\sqrt{I_{ds}/\beta}} \right], \quad (6.86)
\]
where

\[
K_G = \sqrt{I_{ds} \mu_n C_{ox} \frac{W}{L}} \cdot \exp \left[ \frac{-V_{th2} - (V_{cm} - V_{th2})}{2\sqrt{I_{ds}/\beta}} - \frac{1}{\sqrt{2}} \right]. \tag{6.87}
\]

Thus, in the triode region, the transconductance of the common-source transistor of a cascode stage exhibits a quasi-exponential dependence on \( V_c \). The exact value of the transconductance given in Eq. 6.84 is plotted in Fig. 6.31 and compared with an exponential curve. We observe that Eq. 6.84 can be well approximated by an exponential law over a large dynamic range of 15 dB with an error of less than \( \pm 1 \) dB. Therefore, cascading four such stages will provide a quasi-exponential gain range of 60 dB, thus fulfilling the requirement on the VGA gain behavior stated in the previous sections.
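
The quoted 15 dB range can be checked directly from the expression $V_x + \sqrt{V_x^2 + 1}$. The following minimal sketch scans the normalized control voltage symmetrically around $V_x = 0$ and reports the range over which the deviation from $e^{V_x}$ stays within 1 dB; the scan step and the helper names are arbitrary choices made for this illustration.

```python
import math

def exact_db(v_x):
    """Normalized transconductance of Eq. 6.84 in dB, relative to V_x = 0."""
    return 20.0 * math.log10(v_x + math.sqrt(v_x * v_x + 1.0))

def error_db(v_x):
    """Deviation of the exact law from the ideal exponential exp(V_x), in dB."""
    return exact_db(v_x) - 20.0 * math.log10(math.exp(v_x))

step = 0.001
v_lim = 0.0
# Widen the symmetric interval [-v_lim, +v_lim] while the error stays below 1 dB.
while abs(error_db(v_lim + step)) < 1.0 and abs(error_db(-(v_lim + step))) < 1.0:
    v_lim += step

usable_range_db = exact_db(v_lim) - exact_db(-v_lim)
print(f"|V_x| limit for 1 dB error: {v_lim:.3f}")
print(f"usable gain range         : {usable_range_db:.1f} dB")
```

The scan ends slightly below $|V_x| = 1$ and gives a usable range of about 15 dB per stage, consistent with the value stated above.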

[Figure 6.31 (image not recovered): normalized transconductance $g_m/g_m|_{V_x=0}$ [dB] versus the normalized control voltage $V_x(V_c)$ for the exact expression $V_x + \sqrt{V_x^2 + 1}$ and for $\exp(V_x)$, with the 15 dB range marked, together with the approximation error [dB] versus $V_x(V_c)$.]

ment of a conventional PMOS load with an NMOS transistor used as feedback. This technique is often used to enhance the bandwidth of IF or baseband amplifiers to several hundreds of MHz without the need for passive inductors and with very little power consumption penalty (which is actually not proportional to the -square of- bandwidth extension).

UWB baseband circuits must work in this range of bandwidth (typically up to 500 MHz) and several papers have already proven the efficiency of such a solution [148–151]. In our case, the baseband bandwidth is limited to approximately 200 MHz (see Fig. 3.30). We will thus limit our study to an IF amplifier having a bandwidth in this order of magnitude.

![Figure 6.32.](image_url)

**Figure 6.32.** a) Basic schematic showing the principle of the active load (parasitic conductance and capacitances are not shown, see text for detailed explanation); b) equivalent load impedance $|Z_L|$.

First, a small-signal equivalent circuit of the active inductor is developed in Fig. 6.32. The input impedance of the active PMOS inductor is given by the following equation (for the sake of simplicity, we do not explicitly include the output conductance $g_{oP}$ of the PFET in the fraction):

\[
Z_L = \frac{v_o}{i_o} = \frac{g_x + g_{mN} + s(C_{gsN} + C_x)}{s^2C_{gsN}C_x + sC_{gsN}(g_{mP} + g_x) + g_{mP}g_{mN}} \| g_{oP}, \tag{6.88}
\]

where

\[
g_x = g_{o,bias} + g_{oN} + g_{mbN},
\]

\[
C_x = C_{gsP} + C_{bst}.
\]

The conductance \(g_x\) includes the non-ideal (trans-)conductances, namely the output conductance \(g_{o,bias}\) in parallel with the ideal current source \(I_b\), the output conductance \(g_{oN}\) and the body-effect transconductance \(g_{mbN}\) of the NFET Mn. The capacitance \(C_x\) includes both the gate-source capacitance \(C_{gsP}\) of the PFET and the boost capacitor \(C_{bst}\). The impedance of the PFET active inductor is thus formed by the sum of a second-order low-pass function (real part of the numerator) and a second-order bandpass function (imaginary part of the numerator).

The denominator of the fraction in Eq. 6.88 can be rewritten in terms of a resonance frequency \(\omega_0\) and a quality factor \(Q\) as \(s^2 + s\frac{\omega_0}{Q} + \omega_0^2\), where

\[
\omega_0^2 = \frac{g_{mP}g_{mN}}{C_{gsN}C_x},
\]

\[
Q = \sqrt{\frac{C_x}{C_{gsN}}} \frac{\sqrt{g_{mP}g_{mN}}}{g_{mP} + g_x} = \frac{\omega_0C_x}{g_{mP} + g_x}.
\]

Around the resonance frequency, the imaginary part of the numerator dominates and the peak impedance of the active load can thus be estimated as

\[
Z_{L,max} \approx \frac{C_{gsN} + C_x}{C_{gsN}C_x} \frac{Q}{\omega_0} \| g_{oP} = \frac{C_{gsN} + C_x}{C_{gsN}(g_{mP} + g_x)} \| g_{oP}.
\]
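
The resonance parameters and the peak-impedance estimate can be cross-checked numerically. The sketch below is only illustrative: it neglects $g_{oP}$, assumes that the linear term of the denominator of $Z_L$ is $sC_{gsN}(g_{mP} + g_x)$, as implied by the final expression of $Q$, and uses hypothetical small-signal values that are not taken from the implemented circuit.

```python
import math

def z_load(omega, g_mp, g_mn, g_x, c_gsn, c_x):
    """Active-load impedance (g_oP neglected), evaluated at s = j*omega."""
    s = 1j * omega
    numerator = g_x + g_mn + s * (c_gsn + c_x)
    denominator = s * s * c_gsn * c_x + s * c_gsn * (g_mp + g_x) + g_mp * g_mn
    return numerator / denominator

# Hypothetical small-signal values, chosen only for illustration.
g_mp, g_mn, g_x = 1.0e-3, 2.0e-4, 2.0e-5      # siemens
c_gsn, c_x = 50e-15, 2e-12                    # farads

omega_0 = math.sqrt(g_mp * g_mn / (c_gsn * c_x))          # resonance frequency
q_factor = omega_0 * c_x / (g_mp + g_x)                   # quality factor
z_peak_estimate = (c_gsn + c_x) / (c_gsn * (g_mp + g_x))  # peak-impedance estimate

z_at_resonance = abs(z_load(omega_0, g_mp, g_mn, g_x, c_gsn, c_x))
z_at_dc = abs(z_load(0.0, g_mp, g_mn, g_x, c_gsn, c_x))

print(f"f_0 = {omega_0 / (2.0 * math.pi) / 1e6:.0f} MHz, Q = {q_factor:.2f}")
print(f"|Z_L| at DC        : {z_at_dc:.0f} ohm")
print(f"|Z_L| at resonance : {z_at_resonance:.0f} ohm")
print(f"peak estimate      : {z_peak_estimate:.0f} ohm")
```

For these values the impedance rises well above its DC value near $\omega_0$, and the closed-form estimate agrees with the full expression to within about one percent, since the real part of the numerator is small compared with its imaginary part at resonance.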

6.5.7. Measurements of the Implemented IC

For testing purposes, a chip including two VGAs for the I and Q channels has been implemented separately. The performance of the dual-channel VGA has been measured and is detailed in this section. A chip micrograph of the circuit is shown in Fig. 6.33.

The circuit is based on a fully differential topology and is compatible with the receiver front-end described in Section 6.4. A simplified schematic showing only one channel of the VGA/AGC pair is plotted in Fig. 6.34. Note that both channels actually feature a detector section. On the other hand, the reference generation circuit and the loop filter of the AGC loop section are common to both channels. To improve the symmetry between the I and Q channel, the I/Q VGA cells have been placed in staggered rows (dashed line on Fig. 6.33). Classical design techniques, such as common centroid layouts and dummy units, have been used to improve the symmetry.

Figure 6.34 further illustrates the way the VGA cells are interconnected. In this design, no sophisticated offset compensation circuit is needed, since the received UWB signal is characterized by almost no energy up to 50 MHz. The high-pass corner frequency around 10 MHz allows interstage coupling to be realized simply by means of series capacitors. Common-mode feedback regulation (“CMFB” block) is also depicted. It is realized with the help of an OTA and an external voltage reference $V_b$. This integrated circuit will be used as a baseband amplifier between the front-end and the demodulator for the BER measurements given in Chapter 7.

Figure 6.34. Simplified schematic of the VGA, including square-law detector and AGC loop through $V_c$. The test chip includes two channels for the I and Q baseband signals; only one VGA is depicted here for the sake of clarity.

Gain and Bandwidth

The voltage gain transfer function $A_v(f) = V_o/V_i$ is shown in Fig. 6.35 for four different control voltages $V_c$. The VGA exhibits a voltage gain between 0 and 60 dB. For illustration, the transfer function is compared with the Fourier transform of the received IR-UWB pulse (dashed line). The -3 dB cut-off frequencies for both the transfer functions and the Fourier transform of the received signal are indicated by empty and plain triangle markers, respectively.

The “linear-in-dB” characteristic of the VGA gain at offset frequency $f_o = 150$ MHz is reported in Fig. 6.36. We observe that, for a control voltage $V_c$ varying between 0.7 and 1.3 V, the gain variation in decibels follows a quasi-linear law between 0 and 50 dB.
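
The reported linear-in-dB range can be translated into the exponent of the gain law $G = a^{bV_c}$ used in the AGC analysis. The sketch below assumes an ideal straight line between the two end points quoted above (0 dB at 0.7 V and 50 dB at 1.3 V); the helper `gain_db` is hypothetical and ignores the deviations of the measured curve.

```python
import math

v_c_low, v_c_high = 0.7, 1.3          # control-voltage range in volts (from the text)
gain_low_db, gain_high_db = 0.0, 50.0 # corresponding gains in dB (from the text)

# Slope of the idealized linear-in-dB characteristic, in dB per volt.
slope_db_per_v = (gain_high_db - gain_low_db) / (v_c_high - v_c_low)

# For G = a**(b*Vc): 20*log10(G) = (20/ln(10)) * b*ln(a) * Vc.
b_ln_a = slope_db_per_v * math.log(10.0) / 20.0

def gain_db(v_c):
    """Idealized straight-line gain in dB between the two quoted end points."""
    return gain_low_db + slope_db_per_v * (v_c - v_c_low)

print(f"slope   : {slope_db_per_v:.1f} dB/V")
print(f"b*ln(a) : {b_ln_a:.2f} 1/V")
print(f"gain at Vc = 1.0 V: {gain_db(1.0):.1f} dB")
```

Under this idealization the slope is about 83 dB/V, which corresponds to $b \ln(a) \approx 9.6\ \text{V}^{-1}$ in the pole expressions of the AGC loop.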

As summarized in Fig. 6.37, the high-pass and low-pass corner frequencies are constant over a large part of the gain range. Close to the maximum gain, a bandwidth extension toward higher frequencies can be observed. This is caused by a slight saturation of the last stage of the amplifier chain. Furthermore, the low-pass cut-off frequency remains above the minimum specified frequency of 180 MHz, which is close to the -3 dB bandwidth of the pulse spectrum.

Linearity

The linearity is not a critical parameter for the VGA since it is assumed that intermodulation products affecting the baseband originate from the LNA and the mixer. However, from the point of view of the signal demodulation, a certain level of linearity is required for signals arriving at the demodulator input. This can be explained by the way the signal provided by the VGA is demodulated and detected. The demodulation makes use of a quadricorrelation method, which essentially uses the information carried by the phase difference between the I and Q paths (see Section 3.7). Ideally, no information is carried by the signal amplitude at this level and a strong nonlinearity should theoretically not affect the receiver performance. On the detection side, however, the energy of the demodulated signal is integrated and, consequently, amplitude information (actually the polarity of the demodulated signal) is used to determine the value of the transmitted bit. When the signals in the I and Q paths are strongly compressed or saturated (strong nonlinearity), the amplitude information that is used at the detector may be partially lost with regard to the noise level, thus resulting in a degradation of the SNR. This is illustrated in Fig. 6.38.

Figure 6.35. Measured gain transfer functions of the VGA for different control voltages $V_c$ from 0.72 to 1.4 V. The gain can be varied between 0 and 60 dB. The thick dashed line represents the Fourier transform of the baseband pulse (the vertical dashed line locates the pulse’s center frequency $f_o = 150$ MHz).

Figure 6.36. Measured and simulated VGA gains at the pulse center frequency ($f_o = 150$ MHz) for different control voltages $V_c$ from 0.72 to 1.4 V.

Figure 6.37. Measured high-pass (a) and low-pass (b) cut-off frequencies vs. VGA gain. The -3 dB cut-off frequency stays larger than the minimum specified cut-off frequency (dotted line) over the entire VGA gain range.

In a multipath-free scenario, where the incoming pulse duration is much smaller than the integration window, this unreliable amplitude information, which occurs mostly between pulses, may reduce the value at the integrator output and is equivalent to a decrease of the SNR (higher BER). This is the reason why classical hard-limiting amplifiers cannot be used with energy collection schemes. Actually, it can be shown that the linearity requirement at the output of the VGA depends directly on the ratio between the integration length and the pulse duration. However, in real environments, multipath fading somewhat extends the pulse duration and helps to reduce the linearity requirement. The effect of the distortion must actually be considered at the input of the demodulator and the linearity requirement for our VGA is thus defined at its output. We characterize the linearity by the 1-dB output-referred compression point \( OCP_{1dB} \). Simulations show that this value has to be on the order of 100 mV (-20 dBV). Figure 6.39 shows the measured linearity at the maximum gain. It shows an \( OCP_{1dB} \) around -21.5 dBV, which is close to the required value at the input of the demodulator.

Noise

The noise contribution of the VGA has also been characterized. This noise contribution has to be negligible compared with the RF front-end. The maximum contribution to the overall noise figure has been fixed to $\Delta NF = 0.5$ dB. Assuming a minimum RF front-end gain of $G = 18$ dB (see Fig. 6.21), the maximum equivalent noise figure of the VGA can be calculated from the well-known Friis formula and is, in terms of noise factor,

$$F_{VGA} \leq \Delta F \cdot G + 1. \quad (6.90)$$
This corresponds to a noise figure of $NF_{\text{VGA}} < 14$ dB. The measured equivalent noise figure of the VGA is given in Fig. 6.40. The results are given for frequencies between 100 and 200 MHz, which is the frequency range where the pulse energy is located. The obtained NF fulfills the requirement for high and mid gains. At lower gains, the NF increases owing to the VGA stages operating in deep triode mode. However, this gain setting corresponds to a high input SNR (typically above 20-30 dB) and thus the receiver can withstand an increased noise figure without significant loss in BER performance.
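The budget behind Eq. 6.90 can be sketched numerically as follows. This is a minimal sketch and not the calculation used for the design: the function name is hypothetical, the gain is treated as an available power gain, $\Delta NF$ is interpreted as the allowed increase of the cascaded noise figure in dB, and the front-end noise figure used in the example call (6.5 dB, the minimum value quoted in the receiver summary) is an assumption made only for illustration. The resulting bound depends strongly on that assumed value.

```python
import math

def max_vga_noise_figure_db(nf_frontend_db, gain_db, delta_nf_db):
    """Largest second-stage NF (dB) keeping the cascade NF within delta_nf_db.

    Friis for two stages: F_total = F_1 + (F_2 - 1) / G, hence
    F_2 <= (F_total_max - F_1) * G + 1, which is Eq. 6.90 with
    delta_F = F_total_max - F_1.
    """
    f1 = 10.0 ** (nf_frontend_db / 10.0)
    f_total_max = 10.0 ** ((nf_frontend_db + delta_nf_db) / 10.0)
    gain = 10.0 ** (gain_db / 10.0)
    delta_f = f_total_max - f1
    f_vga_max = delta_f * gain + 1.0
    return 10.0 * math.log10(f_vga_max)

# Illustrative call; the 6.5 dB front-end noise figure is an assumed input.
print(round(max_vga_noise_figure_db(6.5, 18.0, 0.5), 1))
```

A lower assumed front-end noise figure tightens the bound, since the same 0.5 dB increase then corresponds to a smaller absolute $\Delta F$.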

**I/Q Imbalance**

The imbalance between the I and the Q paths is an important parameter for the demodulation of the analog signal. An I/Q imbalance reduces the SNR and, consequently, the BER performance. Figure 6.41 shows the measured imbalance in both amplitude and phase over two decades from 3 to 300 MHz. These measurements have been obtained by the use of mixed-mode S-parameters [152]. The layout techniques used in this design help in reducing mismatches to values down to $\pm 0.3$ dB and $\pm 3^\circ$ at the maximum gain and without any calibration.
Figure 6.40. Measured VGA noise figures for three typical gain settings with source resistance $R_S = 50\,\Omega$. The obtained NF fulfills the requirements ($NF \leq 15\,\text{dB}$) for high- and mid-gains owing to the first VGA stage, which has a fixed gain and a cascode transistor, whose source (input transistor of the differential pair in saturation) has a high output resistance.

AGC Behavior

The static behavior of the AGC loop is illustrated in Fig. 6.42-a. The input signal is varied from -40 dBmV to +35 dBmV. Measurements show that for an input dynamic range of 50 dB, as specified in Section 6.5.1, the output signal only varies over 3.6 dB.

The dynamic behavior of the AGC loop is given in Fig. 6.42-b. This plot is a screen capture of the measured control voltage $V_c$ during the arrival of a test signal. The test signal consists of a train of baseband IR-UWB pulses with a repetition rate of 10 MHz, beginning at $t = 20\,\mu\text{s}$, whose amplitude has been set to 20 mVpp. This represents the case of a VGA set to its maximum gain. A settling time of 20 $\mu\text{s}$ is observed for a gain variation of 40 dB. This fast settling of the AGC minimizes the overhead time needed to obtain a valid signal at the input of a following demodulator or A/D converter. The settling time can be adjusted by means of the external capacitor $C_{\text{ext}}$.

Figure 6.41. Measured amplitude (a) and phase I/Q imbalance (b) at maximum gain (60 dB). Below 200 MHz, amplitude and phase imbalances stay smaller than $\pm0.3$ dB and $\pm3^\circ$, respectively. At a gain of 30 dB ($V_c = 1.1$ V), the measured values are reduced to $\pm0.1$ dB and $\pm0.5^\circ$.

6.5.8. VGA Performance Summary

In this section, a dual-channel VGA with a gain range of 0 to 60 dB and an AGC loop has been specified, implemented and characterized. It has been shown that the classical logarithmic law at the detector is not mandatory when using a VGA featuring an exponential gain law. Thus, a simpler circuit implementation based on a single-transistor half-wave rectifier can be employed. This solution advantageously replaces the classical logarithmic detector in terms of circuit complexity but lacks predictability in its time-domain response. This has been considered as not critical for our application. Nevertheless, the response of the AGC loop can be well approximated by a first-order high-pass filter. A summary of the VGA performance is given in Table 6.2.
Figure 6.42. a) VGA input-output transfer function with AGC enabled in static mode ($f_{in} = f_o = 150$ MHz), b) AGC loop dynamic behavior with a train of IR-UWB pulses applied during 100 $\mu$s ($V_{pp, in} \approx 20$ mV).

Table 6.2. VGA performance summary

<table>
<thead>
<tr>
<th>Gain</th>
<th>min</th>
<th>max</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gain range</td>
<td>0 dB</td>
<td>60 dB</td>
</tr>
<tr>
<td>Bandwidth</td>
<td>&gt;180 MHz</td>
<td></td>
</tr>
<tr>
<td>Noise figure ($R_s = 50 \Omega$)</td>
<td>$\approx 21$ dB</td>
<td>$\approx 12$ dB</td>
</tr>
<tr>
<td>IIP3</td>
<td>-14 dBV</td>
<td>-50 dBV</td>
</tr>
<tr>
<td>OICP</td>
<td>-21.5 dBV</td>
<td></td>
</tr>
<tr>
<td>I/Q imbalance</td>
<td>&lt; $\pm 0.3$ dB</td>
<td>&lt; $\pm 3^\circ$</td>
</tr>
<tr>
<td>AGC output variation</td>
<td>3.6 dB for 50 dB input range</td>
<td></td>
</tr>
<tr>
<td>AGC settling time</td>
<td>20 $\mu$s (40 dB range)</td>
<td></td>
</tr>
<tr>
<td>Supply current @ 1.8V</td>
<td>5 mA</td>
<td></td>
</tr>
</tbody>
</table>

6.6. Receiver Summary

In this Chapter, we developed, designed and characterized several prototype ICs to evaluate the requirements of an entire receiver front-end dedicated to the down-conversion and the amplification of carrier-based IR-UWB signals between 3.3 and 4.8 GHz (lower UWB frequency band). The front-end integrates a low noise amplifier (LNA), two (I&Q) down-conversion mixers and a dual-channel variable gain amplifier (VGA) with an automatic gain control (AGC) loop.

The different test chips have been realized in 0.18 μm CMOS and operate from a 1.8 V supply. The total power consumption, without the test output buffer, reaches 24 mW. A minimum noise figure of 6.5 dB with a gain larger than 20 dB could be reached for the RF front-end. The overall RF-to-baseband gain, including the VGA, is larger than 80 dB, which is more than sufficient to compensate for range variations over 10 m and multipath fading effects.
7.1. The Digital Baseband

After the down-conversion and the amplification by the analog front-end, the incoming quadrature analog signal has to be processed by a back-end entity to extract data (decoding) and timing (synchronization). Most of the previously published works targeting carrier-based IR-UWB signalling [153–155] have reported signal detection methods based on a coherent scheme. Although powerful in terms of bit error rate (BER) performance, these methods require very accurate synchronization on the incoming signal phase and a very complex setting of the receiving matched filter due to the unpredictable multipath profile of the communication channel. Therefore, the advantage gained by the improved BER performance of coherent schemes comes at the cost of prohibitively high complexity and processing delays that increase the power consumption. An additional difficulty appearing with the reduced pulse repetition rates employed by IR-UWB systems is the inherently duty-cycled characteristic of the signal, with duty cycles down to a few percent. Signal acquisition may consequently become a time-consuming task and/or require long synchronization preambles that further reduce the overall power efficiency of the data decoding process.

Our approach uses noncoherent demodulation methods and the “energy collection” principle, which has the advantage of a reduced computational complexity at the cost of a slightly decreased BER performance (approx. 2 dB for $B_{RF}T_i = 1$ and 5 dB for $B_{RF}T_i = 10$, see Fig. 3.10). In this Chapter, we describe a mixed-signal UWB baseband application-specific integrated circuit (ASIC) that addresses the issues of low-duty-cycle IR-UWB signals in terms of detection, acquisition and synchronization.

As depicted in the block schematic of the mixed-signal baseband ASIC (Fig. 7.1), the first task of the back-end processor is to demodulate the quadrature IR-UWB baseband signals provided by the VGA to obtain a bipolar analog signal $V_{\text{dem}}$ (see also Fig. 3.16). This function is implemented by an analog I/Q demodulator described in the next Section. Following the analog demodulation, a signal detection device employing a noncoherent “energy collection” method based on an “integrate & dump” operation integrates the received energy during a period of time $T_i$ (see Section 3.8). This analog signal is then converted into a digital signal by a 1-bit analog-to-digital block based on comparators. Finally, a digital synchronization block (Section 7.3) is responsible for the initial signal acquisition and the signal tracking. This block finds the correct alignment of the $T_i$-duration integrating window relative to the incoming pulse. It also generates the “dump” signal for resetting the analog integrators.


**Figure 7.1.** Simplified block schematic of the mixed-mode baseband ASIC. The realized IC consists of four main functional blocks: 1) the quadrature signal demodulation, 2) the noncoherent energy detection, 3) the analog-to-digital conversion and 4) the digital section responsible for signal acquisition and tracking.
7.2. The Analog Section: Demodulation & Detection

7.2.1. Principles

The ASIC’s analog front-end does not rely on the use of high-speed analog-to-digital converters (ADCs), as is the case in most of the other referenced works. It relies on a noncoherent demodulation method followed by a simple I&D detection. Both functions have been implemented by means of analog circuitry, as shown in Fig. 7.2. The motivation for this strategy is to reduce the power consumption by avoiding the implementation of Nyquist-rate ADCs and power-hungry DSPs running at full speed to process the received signal (e.g. channel equalization or RAKE receiver). The analog part is common to both signal acquisition and signal tracking. As explained later in Section 7.3, the signal acquisition employs five I&D devices (N=5) that cover an entire pulse period, whereas for signal tracking, a modified “early/late” algorithm using three I&D blocks is used.

The main advantage of such an implementation is the scalability of the power consumption with the received PRR. ADC-based solutions require clock speeds fixed by the input signal’s bandwidth, whereas for I&D-based detectors, the main clock is only dependent on PRR. This method proves to be particularly efficient at low PRR. In very dense multipath environments, the PRR is limited by interpulse interference (IPI) to values down to 5 MHz for channel CM8 [51], as shown by simulations in Section 3.8. Under these conditions, the clock of an I&D-based ASIC can be scaled down proportionally to reduce the power consumption by the same order of magnitude.

7.2.2. Quadrature Pulse Demodulation

The quadrature baseband pulses are demodulated with the help of an analog quadricorrelator, whose output restores the RF pulse envelope [71]. A block schematic of this analog circuit is given in the leftmost part of Fig. 7.2. Mathematically, the demodulation operation can be written in time-domain as

$$V_{dem}(t) = V_I(t - \delta) \cdot V_Q(t + \delta) - V_Q(t - \delta) \cdot V_I(t + \delta), \quad (7.1)$$

where \(V_I(t)\) and \(V_Q(t)\) are the in-phase and quadrature baseband signals, respectively. The delay \(\delta\) corresponds to a \(\pi/4\) phase shift. Since both signals are centered around \(f_o\) (i.e., the offset frequency of the BFSK modulation), the \(-\pi/4\) and \(+\pi/4\) phase shifts can be implemented by means of a first-order low-pass filter (LP) and a first-order high-pass filter (HP), respectively. The corner frequencies are set equal to the modulation frequency offset $f_o$ (see Fig. 7.3-a), i.e., $f_{c,LP} = f_{c,HP} = f_o$. The first-order filters are $g_{m}$-$C$ cells realized with a folded cascode operational transconductance amplifier (OTA) and two 50 pF on-chip capacitors. The resulting demodulated signal $V_{dem}$ is proportional to the envelope of the incoming I/Q signal and has the form of a train of return-to-zero (RZ) bipolar pulses.

Figure 7.2. Block schematic of the analog front-end of the baseband ASIC showing the quadricorrelation-based demodulation (left-hand part), the detection principles with the $N = 5$ consecutive $T_i = 40$ ns integrations (center part) and the 1-bit A/D conversion (right-hand part). The corner frequency of the low- and high-pass filters of the demodulator corresponds to the BFSK offset modulation $f_o$.
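The polarity behavior of Eq. 7.1 can be checked numerically. The sketch below is illustrative only: it assumes ideal, noise-free, constant-envelope quadrature tones at $\pm f_o$ (with $f_o = 150$ MHz, the baseband pulse center frequency quoted in Chapter 6), models the $\pm\pi/4$ phase shifts as ideal time shifts $\delta = 1/(8 f_o)$ rather than as first-order filters, and uses a hypothetical I/Q sign convention. All function names are introduced here for illustration.

```python
import math

F_O = 150e6                   # BFSK offset frequency in Hz (value from Chapter 6)
DELTA = 1.0 / (8.0 * F_O)     # time shift giving a pi/4 phase shift at F_O

def iq_tones(sign):
    """Ideal unit-amplitude I/Q pair at +F_O (sign=+1) or -F_O (sign=-1)."""
    w = 2.0 * math.pi * F_O * sign
    return (lambda t: math.cos(w * t)), (lambda t: math.sin(w * t))

def quadricorrelator(v_i, v_q, t, delta=DELTA):
    """Eq. 7.1: cross-multiplication of the delayed and advanced I/Q signals."""
    return v_i(t - delta) * v_q(t + delta) - v_q(t - delta) * v_i(t + delta)

for sign in (+1, -1):
    v_i, v_q = iq_tones(sign)
    samples = [quadricorrelator(v_i, v_q, n * 1e-9) for n in range(5)]
    print(sign, [round(s, 6) for s in samples])
```

For these ideal tones the expression reduces to $\pm\sin(4\pi f_o \delta) = \pm 1$ at every instant, so the sign of $V_{dem}$ follows the sign of the frequency offset, which is the property exploited to recover the transmitted bit.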

Note that the implementation of the quadricorrelation-based demodulator presented in this section differs from the one based on comparators and S&H devices described in Chapter 3 (Fig. 3.16 and Fig. 3.18). The proposed implementation replaces the nonlinear S&H operation by a multiplication of phase-shifted signals. The main reason is that an implementation using comparators and S&Hs, as depicted in Fig. 3.18, introduces an excessive delay that prevents the sampling of the baseband signal of the other branch at its maximum. This non-optimum sampling instant reduces the SNR at the output of the demodulator and, consequently, degrades the resulting BER. This effect is accentuated when a low current consumption is targeted. Circuit simulations showed that the use of $g_{m}$-$C$ cells for the implementation of the filters and of Gilbert cells for the multiplication operation of Eq. 7.1 considerably relaxes the design constraints in terms of power consumption. Moreover, this solution allows an optimum signal demodulation since the signals in the I and Q paths are equivalently loaded by the demodulator input (the input circuits of the LP and HP consist of an identical differential pair). Therefore, the I and Q signals are equally affected before the multiplication operation and the resulting output envelope $V_{dem}$ is maximized.

### 7.2.3. Signal Detection

The signal detection method is based on an energy-collection scheme. The demodulated output signal $V_{dem}$ is passed through an I&D section (center part of Fig. 7.2), whose main function is to accumulate and hold the demodulated signal. The block schematic of one I&D block is depicted in Fig. 7.3-a. The integration of the incoming signal is enabled when the input multiplexer selects the differential analog signal $V_{dem+} - V_{dem-}$ (“WinCtrl” signal). The hold state is obtained by switching the input back to the reference voltage $V_{ref}$; this is equivalent to setting $V_{dem} \triangleq 0$. The integrator’s transconductance is a fully differential folded cascode OTA with common mode feedback (CMFB). The reference signal for the common mode voltage $V_{cm}$ is $V_{ref}$. The transconductance of the OTA is 150 $\mu$S and enables a sufficient voltage swing and a high output impedance to drive the 10 pF integrating capacitors $C_i$. The simulated integrator bandwidth, defined by the first and second poles of the transfer function, covers a frequency range from $f_{p1} < 100$ kHz to $f_{p2} \approx 1$ GHz (Fig. 7.3-b). This bandwidth ensures a quasi-ideal integration of the demodulated pulses, whose -10 dB bandwidth typically extends up to 200 MHz. The reset circuit is realized using two transmission gates that pull the two capacitors to the common mode voltage in less than 20 ns.
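The behavior of one I&D element can be illustrated with an idealized model. The sketch below is a simplified illustration and not the circuit itself: it treats the OTA as an ideal transconductor charging a single capacitor ($V_{out} = \frac{g_m}{C_i}\int V_{dem}\,dt$), ignores the finite poles $f_{p1}$ and $f_{p2}$ as well as any differential gain factor, and uses a hypothetical class name and interface.

```python
class IntegrateAndDump:
    """Idealized gm-C integrate-and-dump element (illustrative model only)."""

    def __init__(self, gm=150e-6, c_int=10e-12):
        self.gm = gm          # OTA transconductance in siemens
        self.c_int = c_int    # integrating capacitor in farads
        self.v_out = 0.0      # integrator output relative to V_ref

    def step(self, v_dem, dt, window_open):
        """Advance by dt; integrate only while the window ("WinCtrl") is open."""
        if window_open:
            self.v_out += self.gm / self.c_int * v_dem * dt
        return self.v_out     # the value is simply held when the window is closed

    def dump(self):
        """Reset the capacitor (the "dump" operation)."""
        self.v_out = 0.0

iad = IntegrateAndDump()
dt = 0.1e-9
for n in range(600):                              # 60 ns of simulated time
    iad.step(10e-3, dt, window_open=(n < 400))    # window open for the first 40 ns
print(f"held output: {iad.v_out * 1e3:.2f} mV")
```

With a constant 10 mV input and a 40 ns window, this ideal model accumulates about 6 mV and then holds it until the dump, which mirrors the integrate, hold and reset sequence of the timing diagram.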

Figure 7.3. a) Block schematic of the I&D function. The integrator’s transconductance is a fully differential folded cascode OTA including a common mode feedback circuit; $V_{ref}$ serves as a reference voltage for the common mode voltage $V_{cm}$; b) typical transfer function of the integrator with poles; c) timing diagram of the integrate and dump operation; the dot markers identify the sample operation that reads the output value of the I&D devices.

The integration time $T_i$ for the signal detection is related to the characteristics of the propagation channel. The performance of the receiver with the IEEE indoor UWB channel models CM1-4 and CM7-8 [51] has been simulated for different $T_i$ in Fig. 3.25. Some flexibility is provided in the baseband ASIC to accommodate the different multipath channel scenarios. For instance, the ASIC can be driven by an external clock at different frequencies to enable PRRs from 1 to 20 MHz. The corresponding integration times $T_i$ are scaled proportionally from 200 ns to 10 ns. This range can thus match the optimum integration times of the six typical indoor environments.
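The proportional scaling of the integration window with the PRR can be written out as a short calculation. The sketch assumes that $N = 5$ consecutive windows exactly tile one pulse period, so that $T_i = 1/(N \cdot \mathrm{PRR})$; the function name is illustrative.

```python
def integration_time(prr_hz, n_windows=5):
    """Integration window T_i (s) when n_windows tile one pulse period."""
    return 1.0 / (n_windows * prr_hz)

for prr_mhz in (1, 5, 10, 20):
    t_i = integration_time(prr_mhz * 1e6)
    print(f"PRR = {prr_mhz:2d} MHz -> T_i = {t_i * 1e9:.0f} ns")
```

The end points reproduce the 200 ns and 10 ns values quoted above, and a PRR of 5 MHz gives the 40 ns window used in Fig. 7.2.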

7.2.4. Comparator Bank

The outputs of the I&D elements feed the comparators, which are responsible for the analog-to-digital conversion. Contrary to digital signals having well-defined levels, the outputs of the I&D devices are analog signals with no predictable level. Therefore, sufficient speed and gain must be achieved in the comparison operation to restore a digital level. Comparators using a track-and-latch topology [156] meet these requirements.

Figure 7.4 shows a simplified schematic of a comparator. During the track phase, the comparison occurs continuously and fast but, due to the small gain, no useful signal can be exploited at the output. By setting the gate signal of transistor M10 to low, the comparator enters the latch phase (signal polarity restoration). The diode-connected transistors M8 and M9 are switched off and the output latch flips to one of its two states, depending on the last result of the comparison. In Fig. 7.3-c, this latch phase occurs right before the reset (dump) operation of the I&D device. There are three different types of comparators in the analog front-end; they only differ by their number of inputs. Comparators 0-9 in Fig. 7.2 have two inputs, the summing comparator (“MinMax”) has five inputs, whereas the “On-Time” device has one. The outputs of these comparators are processed by the digital circuit described in the following Section 7.3.

Figure 7.4. Simplified schematic of the track-and-latch comparator (two inputs). The track phase occurs during the integration operation, whereas the latch phase corresponds to the conversion of the analog comparison into a digital signal (dot markers in Fig. 7.3-c).

7.3. The Digital Section: Synchronization Algorithm

In order to provide reliable data, the baseband ASIC must synchronize on the incoming IR-UWB signal. The difficulty in synchronizing on these signals is accentuated by the fact that the information-bearing waveforms are impulse-like and can have a very low duty cycle. This may occur especially in the absence of multipath distortion. On the other hand, UWB synchronization must be robust enough to find a signal in dense multipath environments and simple enough to enable a low-power implementation. The proposed synchronization algorithm relies strongly on the proposed energy-collection scheme to overcome these issues. From an implementation point of view, it is realized by a digital sub-circuit, which has been embedded with the analog demodulation. The task of the digital circuit is to process the information provided by the 1-bit analog-to-digital conversion to extract the correct data and the received signal clock.

The digital subsection consists of two main paths: 1) the acquisition path and 2) the tracking path. These two paths are identified by gray-shaded blocks in the top-level schematic of the digital section illustrated in Fig. 7.5. The synchronization is carried out differently depending on whether no signal has been previously found (initial acquisition or “cold start”) or whether a signal has been found after an initial acquisition. In the latter case, the signal has to be tracked to compensate for clock offsets or multipath variations that may slightly change the position of the maximum energy within a pulse period.

7.3.1. Issues during Initial Signal Acquisition (Cold Start)

During the initial acquisition of the UWB pulses (cold start phase), a simple and classical “early/late” algorithm used to find the IR-UWB pulse’s main peak is not sufficient due to the reasons illustrated in Fig. 7.6 and enumerated hereafter:

a) the algorithm may fall into a “trap”, i.e., a local maximum that is not the main energy peak;

b) no reliable information is present, e.g. between two pulses;

c) the algorithm receives ambiguous shifting information. When the synchronization algorithm is locked around a local minimum, the “early” and “late” states cannot provide reliable shifting information for the synchronization toward the main peak.

7.3.2. Proposed Solution

The topology adopted in the baseband receiver makes use of $N = 5$ consecutive integrating windows $T_i$ covering one pulse period $T_b$. This choice has been motivated by the fact that even at the minimum PRR of 5 MHz (200 ns pulse period), the receiver can accommodate a near-optimum integration time $T_i$ of 40 ns. Since the pulses are sent with a fixed period (no PPM modulation is used), the first task of the acquisition section mainly consists in identifying the integrator that holds the maximum absolute value\(^1\). This operation is carried out by the “Min/Max” detector. This circuit relies on a purely combinatorial logic circuit that checks the output of each of the comparators 0-9 of the A/D section shown in Fig. 7.2. The ten outputs Comp\(\langle 0:9 \rangle\) of the comparison operations are described by the following ten tests:

\[
\begin{aligned}
E_1 &\leq E_2, \quad E_1 \leq E_3, \quad E_1 \leq E_4, \quad E_1 \leq E_5, \\
E_2 &\leq E_3, \quad E_2 \leq E_4, \quad E_2 \leq E_5, \\
E_3 &\leq E_4, \quad E_3 \leq E_5, \\
E_4 &\leq E_5,
\end{aligned}
\]

where $E_k$ is the output of the $k^{th}$ I&D element. If all tests involving the same $E_k$ agree with the “Min/Max” output, the index $k$ of the corresponding window with the maximum energy is stored for further processing, as explained in the next paragraph. In the case where none of the windows could be identified as holding the maximum energy, a negative validity signal is passed, indicating that no consistent information could be obtained from the comparators.

To illustrate this decision mechanism, we assume that a positive pulse is present in the window $k = 2$, leading to a positive value $E_2$ at the output of the second I&D. All other I&D outputs $E_{k \neq 2}$ are undefined due to noise but smaller than $E_2$ (high SNR assumption).

\(^1\)Remember that, depending on the emitted bit value (frequency offset of the BFSK modulation), the demodulated pulse has a positive or negative polarity.
Consequently, the previous tests lead to the following results

$$E_1 < E_2, \quad E_2 > E_3, \quad E_2 > E_4, \quad E_2 > E_5. \quad (7.2)$$

All other tests result in an undetermined output due to noise. Furthermore, the “Min/Max” output is set to high, meaning that a positive pulse occurs in one of the observed integrators (“summing comparator” in Fig. 7.2). We observe from Eq. 7.2 that the comparison results involving window $k = 2$ are mutually consistent and are also in agreement with the “Min/Max” output. Therefore, the window $k = 2$ is considered as holding the maximum pulse energy.
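The decision mechanism described above can be sketched in software. This is one possible reading of the text rather than the implemented logic: the function names are hypothetical, the comparators are modeled as ideal, a high “Min/Max” output is assumed to select a search for the largest $E_k$ and a low output a search for the smallest one, and windows are numbered 1 to 5 as in the ten tests.

```python
from itertools import combinations

N_WINDOWS = 5

def pairwise_tests(energies):
    """The ten tests E_i <= E_j (i < j, 1-based) for ideal comparators."""
    return {(i, j): energies[i - 1] <= energies[j - 1]
            for i, j in combinations(range(1, N_WINDOWS + 1), 2)}

def find_peak_window(tests, positive_pulse):
    """Index k whose four tests all agree with the "Min/Max" polarity, else None."""
    for k in range(1, N_WINDOWS + 1):
        agrees = True
        for (i, j), i_le_j in tests.items():
            if k not in (i, j):
                continue
            # Translate the test result into "E_k is the larger of the pair".
            k_is_larger = i_le_j if j == k else not i_le_j
            if k_is_larger != positive_pulse:
                agrees = False
                break
        if agrees:
            return k
    return None   # negative validity: no consistent window found

tests = pairwise_tests([0.1, 0.9, -0.2, 0.3, 0.0])
print(find_peak_window(tests, positive_pulse=True))
```

In this example the four tests involving $E_2$ are the only ones that matter; the other six may take arbitrary values without changing the result, which reflects the situation of Eq. 7.2.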

### 7.3.3. Improvement for Reduced SNR

To further increase the accuracy and the reliability of the synchronization during a cold start, especially at low signal-to-noise ratios (SNR), the window containing the maximum energy is searched during $M$ pulse periods $MT_b$. This method, similar to schemes achieving coding gain, is often implemented for data recovery in spread spectrum systems such as in [157], but here it is only used during initial acquisition. The need for $M = 10$ consecutive pulse periods has been determined as a trade-off between acquisition speed, i.e., power consumption, and reliability of the acquisition with respect to the false alarm rate (see Section 7.3.6). During the $m$th observed pulse period, the window $k_m$ identified by the “Min/Max” detector as delivering the maximum energy is reported as a vector $z_m$ on an equivalent unit circle using vector arithmetic, as shown in Fig. 7.7. The relationship between the complex number $z_m$ and the ordinal number $k_m$, which identifies the window receiving the maximum energy, can be written as

$$z_m = e^{j2k_m\pi/N}, \quad (7.3)$$

where $N=5$ is the total number of consecutive integrating windows over one pulse period and $k_m = 0 \ldots N - 1$ identifies each of the five consecutive integration windows. For the sake of simplicity, the initial window $k = 0$ is aligned with the real axis (zero initial angle).

By accumulating the vectors $z_m$ during $M$ consecutive pulse periods, we obtain a vector $L_c$, whose angle $\phi$ with the vector $k_m = 0$ of the reference window provides the shifting information $t_{id}$ used to optimally adjust the position of the “early/late” windows of the subsequent tracking algorithm. In this way, the synchronization has much better accuracy than the arbitrary window position defined by the ASIC’s reset release. The angle $\phi$ relative to the window $k_m = 0$ resulting from vector accumulation is defined as

$$\phi = \angle L_c = \angle \left\{ \sum_{m=1}^{M} z_m \right\} = \angle \left\{ \sum_{m=1}^{M} e^{j2k_m\pi/N} \right\}, \quad (7.4)$$

where $M$ is the number of consecutive observed pulse periods during cold starts. Finally, the initial delay $t_{id}$ can be extracted from the angle $\phi$ by the following formula

$$t_{id} = T_b \frac{\phi}{2\pi}, \quad (7.5)$$

where $T_b$ is the pulse period (bit). Figure 7.7 illustrates this method with two simple examples at high and low SNR, shown in the left and right unit circles, respectively. For the sake of clarity, the number of observed pulse periods used in this example is $M = 4$ (the actual implementation of the ASIC uses $M = 10$). The reference window during a pulse period $T_b$ is identified by vector $k_m = 0$.
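Equations 7.3 to 7.5 can be evaluated directly for the two examples of Fig. 7.7. The following sketch uses floating-point complex arithmetic instead of the lookup table and CORDIC hardware; the function names are illustrative, and wrapping the angle to $[0, 2\pi)$ so that the delay is non-negative is an assumption. The pulse period of 100 ns corresponds to a PRR of 10 MHz.

```python
import cmath
import math

def accumulate_windows(window_indices, n_windows=5):
    """Sum of the unit vectors z_m = exp(j*2*pi*k_m/N) (Eqs. 7.3 and 7.4)."""
    return sum(cmath.exp(2j * math.pi * k / n_windows) for k in window_indices)

def initial_delay(l_c, t_b):
    """Eq. 7.5, with the angle wrapped to [0, 2*pi) (an assumption)."""
    phi = cmath.phase(l_c) % (2.0 * math.pi)
    return t_b * phi / (2.0 * math.pi)

T_B = 100e-9  # pulse period for a PRR of 10 MHz
for label, ks in (("high SNR", [1, 1, 0, 1]), ("low SNR", [1, 2, 0, 3])):
    l_c = accumulate_windows(ks)
    print(f"{label}: |L_c|/M = {abs(l_c) / len(ks):.2f}, "
          f"t_id = {initial_delay(l_c, T_B) * 1e9:.1f} ns")
```

In the high-SNR example the accumulated vector keeps about 86% of its maximum length and points between windows 0 and 1 (a delay of roughly 15.5 ns), whereas in the low-SNR example the four vectors largely cancel and only 25% of the maximum length remains.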

### 7.3.4. CORDIC Algorithm

The determination of the optimal “on-time” window position (i.e., the initial delay $t_{id}$ in the time-domain diagram of Fig. 7.7) requires conversions between polar and Cartesian representations. Whereas the $N$ discrete values taken by the vectors $k_m$ identifying each integration window can be converted to a Cartesian representation by means of a small lookup table (system parameter), the sum in Eq. 7.4 can take on arbitrary values depending on the SNR and the time offset between the transmitter and the receiver, and its angle must be computed. A way to implement this conversion is to use the CORDIC algorithm [158].

CORDIC is generally employed when a hardware multiplier is unavailable. This algorithm essentially determines the angle $\phi$ of the complex vector $L_c$ by rotating it iteratively toward angle zero by the angle of the following complex number

$$r_k = 1 + j \cdot \frac{1}{2^{k-1}} \quad (7.6)$$

in the $k^{th}$ iteration. This rotation is done either clockwise or counterclockwise depending on whether the vector $L_c$ is in the upper or the lower half-plane. The discrete angle steps can be simply obtained with a division by a power of two (bit shift).

Figure 7.7. Five integration windows represented by points on the unit circle. The vector angle $\phi$ represents the calculated position of the pulse during the initial acquisition (cold start) with respect to the reference window $k_{m} = 0$. This angle corresponds to the initial delay $t_{id}$ the receiver has to wait to synchronize on the received pulse stream (the “On-Time” window is thus optimally placed on the pulse). This figure shows the low- and high-SNR cases and illustrates how $M=4$ consecutive pulse periods are used for the determination of the initial delay $t_{id}$ and the CORDIC length $|L_c|$. For the high-SNR case, the integration windows detected as receiving the maximum energy have been chosen as $\{k_{1}, k_{2}, k_{3}, k_{4}\} = \{1, 1, 0, 1\}$, while for low SNR, the example considers $\{k_{1}, k_{2}, k_{3}, k_{4}\} = \{1, 2, 0, 3\}$.

The number of iterations $K$ in the CORDIC algorithm fixes the angle accuracy, which corresponds to the resolution $t_{res}$ of the initial delay $t_{id}$. The resolution $t_{res}$ given by the CORDIC algorithm after $K$ iterations can be approximated as follows:

$$t_{res} \approx \frac{T_b}{2^{K+1}}. \quad (7.7)$$

Since no multi-phase clock has been used in this realization, the minimal physical resolution for window positioning corresponds to the clock period $1/f_{clk}$. In this realization, the clock frequency has been fixed to $f_{clk} = 300$ MHz for a PRR of 10 MHz. This fixes the number of iterations required for the CORDIC algorithm to provide an accuracy better than $1/f_{clk}$. From Eq. 7.7, with $T_b = 100$ ns and $t_{res} < 1/f_{clk} = 3.33$ ns, we thus obtain $K \geq 4$. This represents a synchronization calculation delay of 13.33 ns, i.e., four clock periods. This value is sufficiently small to adjust the “on-time” window right after the initial acquisition period without any delay due to processing. The rapid convergence of the CORDIC algorithm helps in minimizing the latency of the synchronization process without any complex parallelized circuitry.
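The iteration described by Eq. 7.6 and the iteration count derived from Eq. 7.7 can be sketched as follows. This is a floating-point illustration of CORDIC in vectoring mode, not the fixed-point circuit of the ASIC: the function names are hypothetical, the multiplications by $2^{-(k-1)}$ stand in for hardware bit shifts, and the initial $\pm 90^\circ$ rotation used to bring left half-plane vectors into the convergence range is an assumption that the text does not detail.

```python
import math

def cordic_vectoring(x, y, iterations):
    """Return (angle, magnitude) of the vector (x, y) using shift-and-add steps."""
    angle = 0.0
    # Assumed pre-rotation by +/-90 degrees so that the vector starts in the
    # right half-plane, where the iterations below converge.
    if x < 0.0:
        if y >= 0.0:
            x, y, angle = y, -x, math.pi / 2.0
        else:
            x, y, angle = -y, x, -math.pi / 2.0
    gain = 1.0
    for k in range(1, iterations + 1):
        step = 2.0 ** -(k - 1)          # a right shift by (k - 1) bits in hardware
        if y >= 0.0:                    # upper half-plane: rotate clockwise
            x, y = x + y * step, y - x * step
            angle += math.atan(step)
        else:                           # lower half-plane: rotate counterclockwise
            x, y = x - y * step, y + x * step
            angle -= math.atan(step)
        gain *= math.sqrt(1.0 + step * step)   # each rotation stretches the vector
    return angle, x / gain

def min_cordic_iterations(t_b, f_clk):
    """Smallest K for which T_b / 2**(K + 1) (Eq. 7.7) is below 1 / f_clk."""
    k = 1
    while t_b / 2 ** (k + 1) >= 1.0 / f_clk:
        k += 1
    return k

print(min_cordic_iterations(100e-9, 300e6))
angle, length = cordic_vectoring(1.92705, 2.85317, iterations=4)
print(f"phi = {math.degrees(angle):.1f} deg, |L_c| = {length:.2f}")
```

The example vector is the accumulated sum of the high-SNR case of Fig. 7.7. After $K$ iterations the residual angle error is bounded by the last rotation step, $\arctan(2^{-(K-1)})$, which is about $7.1^\circ$ for $K = 4$; the same loop also yields the vector length used for the SNR estimate.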

### 7.3.5. SNR Estimation

An SNR estimate can be obtained by extracting the vector’s modulus $|L_c|$ calculated by the CORDIC algorithm. From Fig. 7.7, it becomes intuitively clear that in the absence of a signal (SNR=$-\infty$), the vectors are randomly placed on the unit circle and add, theoretically, to a vector $L_c$ of zero length. On the other hand, the presence of a signal (high SNR) results in the accumulation of vectors lying close together. Thus, the vector modulus calculated by the CORDIC algorithm is related to the signal quality. This information can be used to assess the SNR before entering any decoding process, i.e., $|L_c| = f(SNR)$. Experimental results are given in Section 7.6.

### 7.3.6. Optimum Number of Pulses M during Cold Start

The parameter $M$ represents the number of consecutive pulses observed during the initial acquisition (cold start). Locating the pulse position at low SNR becomes increasingly difficult due to the adverse effect of the noise, which is superimposed on the incoming pulses. On the other hand, an excessively long duration at cold start, while providing reliable information, reduces the efficiency of the receiver. In this section, the optimum value for $M$ has been investigated by means of simulations. Two parameters have been considered for the determination of the number $M$ of observed pulse periods at cold start: 1) the probability of false alarm when no signal is present (AWGN only) and 2) the probability of missed detection when a useful signal is present.

The results corresponding to the situation without any useful signal (false alarm probability with AWGN only) are shown in Fig. 7.8 for three different values of $M$. We also assume that the threshold that defines a valid signal is arbitrarily set to 66% of the maximum length of $|L_c|$ (inner red circles in Fig. 7.8). An initial synchronization delivering a CORDIC length $|L_c|$ above this threshold is considered as valid; in this case, the receiver is set in tracking mode to decode subsequent data.

We observe that, for $M=3$ consecutive pulse periods (Fig. 7.8-a), the false alarm rate is approximately 30%. This is obviously not sufficient to obtain a reliable detection when listening to the noise-only channel. On the other hand, choosing $M=30$ (Fig. 7.8-b) reduces the false alarm rate to 0% (result obtained for 1200 experiments), but increases the time during which the receiver must be completely activated. As shown in Fig. 7.8-c), $M=10$ corresponds to the minimum number of pulse periods that must be observed to obtain a false alarm rate below 2%. This is considered as a good trade-off between the cold start duration (i.e., the power consumption) and the reliability of the detection.
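The noise-only case can be reproduced with a simple model. The sketch below is not the simulation behind Fig. 7.8: it assumes that, with AWGN only, the “Min/Max” detector reports each of the $N = 5$ windows with equal probability and independently from period to period, and it computes the probability that $|L_c|$ exceeds 66% of its maximum length exactly, by enumerating how often each window is reported. Function names are illustrative.

```python
import cmath
import math

def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def false_alarm_probability(m, n_windows=5, threshold=0.66):
    """P(|L_c| > threshold * M) when each k_m is uniform over the N windows."""
    roots = [cmath.exp(2j * math.pi * k / n_windows) for k in range(n_windows)]
    hits = 0
    for counts in compositions(m, n_windows):
        length = abs(sum(c * r for c, r in zip(counts, roots)))
        if length > threshold * m:
            # Number of ordered sequences (k_1 .. k_M) with these window counts.
            ways = math.factorial(m)
            for c in counts:
                ways //= math.factorial(c)
            hits += ways
    return hits / n_windows ** m

for m in (3, 10, 30):
    print(f"M = {m:2d}: P_fa = {false_alarm_probability(m):.4f}")
```

For $M = 3$, 35 of the 125 equally likely sequences exceed the threshold, i.e., 28%, which is in line with the false alarm rate of approximately 30% quoted above; under the same model the probability drops steeply as $M$ grows.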

The same simulations are conducted in the presence of a signal to determine the probability of missed detection (Fig. 7.9). The SNR used for the experiment is equivalent to the expected sensitivity of the receiver in free space. This corresponds to an $E_b/N_0$ of 12 dB for a BER smaller than 1% (see Fig. 7.15). We observe that for $M=10$, more than 90% of the signals are correctly detected. This corresponds to a probability of missed detection smaller than 10%.

A summary of the simulated probabilities of false alarm and missed detection for an SNR ranging from $E_b/N_0 = -\infty$ to $E_b/N_0 = 16$ dB is given in Fig. 7.10-c for the three considered values of $M$. For comparison purposes, the BER and the length of the CORDIC vector $|L_c|$ are plotted in the same figure, in subplots a) and b), respectively. The probability of false alarm increases drastically for decreasing SNR if $M$ is too small (red square markers). A minimum value of $M=10$ (green round markers) is required for a reliable synchronization during cold start. Furthermore, the CORDIC length $|L_c|$ and the rate of valid detection increase monotonically for values of $M$ larger than 10. Therefore, the CORDIC vector length $|L_c|$ can be used unequivocally to assess the quality of the received signal.

Figure 7.8. Determination of the optimum value $M$ in AWGN only by means of simulations of the CORDIC length $|L_c|$: a) $M=3$, b) $M=30$ and c) $M=10$. The inner red circle defines the threshold above which the synchronization is considered as valid (66% of $|L_{c,\text{max}}|$).

Figure 7.9. Determination of the optimum $M$ at the expected sensitivity level of $E_b/N_0 = 12$ dB. The points show the simulated values of the CORDIC vector $L_c$ for a) $M=3$, b) $M=30$ and c) $M=10$ consecutive periods. The inner red circle defines the threshold above which the synchronization is considered as valid. For $M=10$ the probability of missed detection goes below 10%.

### 7.3.7. The Tracking Algorithm, “Early/Late” Revisited

Once initial synchronization is achieved, the pulses are tracked more precisely by a modified “early/late” algorithm. The shift of the windows is triggered by the conditions “Early$>$On-Time” and “Late$>$On-Time”; the comparison values are provided by the signal ‘OnTime’ and comparators ’7’ and ’9’ of Fig. 7.2, respectively. A reliable tracking mode is vital for channels with little multipath distortion. Since the pulse dispersion is small in this case, the early and late windows, which each deliver one bit of information through the comparators, do not contain sufficient energy to provide reliable information on how the tracking algorithm must react. A conventional “early/late” algorithm used in conjunction with I&D detection is thus inefficient for IR-UWB signals propagating through single-path channels without severe multipath reflections. The proposed solution is to let the integrating windows slightly overlap each other, such that the “early” and “late” windows contain part of the short pulse energy. That way, the “early” and “late” integrators provide sufficient signal and the “on-time” window can slide unambiguously towards the optimal position. The drawback of this modification is slightly slower settling, because the comparison of overlapping windows provides less feedback signal to the tracking loop (signals ‘WinCtrl<2:4>’ in Fig. 7.5) than a comparison of non-overlapping windows. Nevertheless, this approach works well once coarse acquisition has been completed during the cold start phase and can track relatively large clock offsets (see Section 7.8).
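The decision rule described above can be illustrated with a minimal, noise-free sketch. It is not the implemented logic of the ASIC: the rectangular pulse model, the 20 ns window spacing, the 3.3 ns step and all function names are assumptions chosen for illustration; only the 5 ns pulse length and the 40 ns integration length are taken from the text.

```python
def overlap(a0, a1, b0, b1):
    """Length of the intersection of the intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def window_energies(t0, pulse_start, pulse_len=5.0, win_len=40.0, spacing=20.0):
    """Noise-free (early, on_time, late) energies for a rectangular pulse.

    t0 is the start of the on-time window; the early and late windows start
    `spacing` ns before and after it. spacing < win_len means the windows
    overlap. The spacing value is an illustrative assumption.
    """
    p0, p1 = pulse_start, pulse_start + pulse_len
    return tuple(
        overlap(t0 + k * spacing, t0 + k * spacing + win_len, p0, p1)
        for k in (-1, 0, 1)
    )


def track_step(early, on_time, late):
    """Return -1 (shift earlier), +1 (shift later) or 0 (hold)."""
    if early > on_time and early >= late:
        return -1
    if late > on_time:
        return +1
    return 0


def track(t0, pulse_start, step=3.3, periods=50, **kw):
    """Apply one decision per pulse period; return the final window start."""
    for _ in range(periods):
        t0 += step * track_step(*window_energies(t0, pulse_start, **kw))
    return t0


if __name__ == "__main__":
    # Pulse centred in the on-time window: only overlapping neighbours see it.
    print("overlapping    :", window_energies(0.0, 17.5, spacing=20.0))
    print("non-overlapping:", window_energies(0.0, 17.5, spacing=40.0))
    print("final window start:", track(0.0, 38.0))
```

With overlapping windows, the early and late integrators still collect part of a pulse that sits in the middle of the on-time window, whereas non-overlapping neighbours collect none of it and would compare noise only.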

### 7.3.8. Timing

Figure 7.11 shows the timing of a complete synchronization sequence, including initial acquisition ($t < 2030$ ns), initial delay setting and tracking mode ($t > 2083$ ns). At the top of this figure, a typical simulated signal $V_{dem}$ as provided by the demodulator is shown (black curve). The main digital signals controlling the analog circuit are sketched in the lower section. These signals follow a pattern that meets the timing requirements to properly control the I&D elements and switch the comparators to latch mode. The “T” marks the moment when the digital section reads the comparators’ states. The reading occurs only when the comparators are in latch mode to avoid metastable states.

Figure 7.10. Optimum choice of $M$ for SNR from $-\infty$ to 16 dB (initial synchronization, 2000 experiments): a) Theoretical BER in free space; b) length of the CORDIC vector $|L_c|$; c) probability of valid detected signals for SNR from $-\infty$ to 16 dB; the false alarm rate (cold start in AWGN only, leftmost markers) goes below 1% for $M$ larger than 10, thus resulting in a monotonically increasing curve (round and diamond markers), similar to $|L_c|$ in b).

The initial acquisition starts nine clock cycles (30 ns) after a reset. In this example, the window containing the maximum energy is the dark-gray shaded window #3. Note that choosing the window with the maximum energy does not necessarily mean a perfect alignment to the signal’s main peak, especially if the signal is corrupted by multipath distortion, as illustrated in this figure. After 2030 ns (2 $\mu$s at 10 Mp/s), the cold start phase is completed. Assuming that all observed pulse periods have produced the same result (high SNR case), window #3 is chosen as the “on-time” window containing the maximum energy for the subsequent tracking mode. The calculation of the initial clock delay yields an equivalent $t_{\text{id}}$ of 60 ns; this value includes the calculation delay of the CORDIC algorithm. The initial delay $t_{\text{id}}$ thus locates the 5 ns long pulse within the 100 ns pulse period. The “on-time” window is then positioned at $t = 2090$ ns (light-gray section). A magnified view of the second tracked pulse period ($t = 2190$ ns) is depicted on the right-hand side of Fig. 7.11.
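As a cross-check, the timing figures quoted above can be written out explicitly. This assumes the 300 MHz clock of Fig. 7.11 and reads the 2 $\mu$s acquisition as 20 pulse periods of 100 ns; the labels $t_{\text{start}}$, $t_{\text{cold}}$ and $t_{\text{on}}$ are introduced here for illustration only.

$$
\begin{aligned}
t_{\text{start}} &= \frac{9}{f_{\text{clk}}} = \frac{9}{300\ \text{MHz}} = 30\ \text{ns},\\
t_{\text{cold}} &= t_{\text{start}} + 20 \times 100\ \text{ns} = 2030\ \text{ns},\\
t_{\text{on}} &= t_{\text{cold}} + t_{\text{id}} = 2030\ \text{ns} + 60\ \text{ns} = 2090\ \text{ns}.
\end{aligned}
$$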

During the tracking phase, a fixed pattern is likewise used to control the analog section. The main difference is that three overlapping windows #2, #3 and #4 are generated for the modified “early/late” algorithm instead of five consecutive ones.

### 7.4. Baseband ASIC Power Consumption

During the acquisition phase, both analog and digital parts of the detection circuit are fully powered and require 9 mW and 4 mW, respectively. Values are given here for a PRR of 5 MHz ($f_{\text{clk}} = 150$ MHz). The demodulation section, which comprises the transconductance-based LPF and HPF and two Gilbert-cells for the multiplication, uses a fully differential topology and consumes 2.9 mW continuously. The five integrators consume an average of 4.6 mW, but most of this power is used by a current mirror buffer (not shown) between the demodulator and the integrator’s input, which serves as an output buffer to provide a test signal for measurement purposes. The comparator bank requires about half a milliwatt.

During the tracking phase, the two remaining integrators corresponding to windows #0 and #1 are no longer used. The analog-to-digital section thus makes use of only three comparators for the “early/late” algorithm and all the digital circuitry associated with these windows can be completely switched off to reduce the power consumption of the ASIC by approximately 50%, down to 6.5 mW (5.5 mW for the analog section).

Figure 7.11. Timing diagram of the overall synchronization for a PRR of 10 MHz ($f_{clk} = 300$ MHz).

A chip photograph of the implemented mixed-mode ASIC and a block schematic are given in Fig. 7.12.

Figure 7.12. Chip photograph of the IR-UWB baseband receiver ASIC and high-level schematic illustrating the ASIC’s functional blocks and the main input and output signals. The test chip is pad limited; half of the chip’s active area is occupied by decoupling capacitors. The active part, including analog and digital cores, occupies less than 0.25 mm$^2$.
### 7.5. BER Measurements

The BER measurements have been conducted with a Tektronix CSA907T/R test set, as illustrated in Fig. 7.13. The RF IR-UWB signals have been generated at a PRR of 5 MHz by means of the IR-UWB transmitter presented in [159] and described in detail in Chapter 4. This PRR ensures a clean and stable signal after center frequency settling during BFSK modulation. The transmitted signals are illustrated for channel A between 3.2 and 3.7 GHz in Fig. 7.14-a.

**Figure 7.13.** Simplified schematic of the BER test setup. The channel is implemented by a 20-70 dB attenuator and represents free-space conditions.

Since the synchronization mechanism does not show any dependence on the data, a pseudo-random data sequence could be employed, even during the acquisition periods. The BER plot of the complete communication link is shown in Fig. 7.15. For the measurements, an integration time of 40 ns has been used. As described in Section 7.2.3, this integration time is a good trade-off between the optimum integration times of the six considered indoor channel models. The receiver and the transmitter are connected with a coaxial cable and a variable 1-dB step attenuator. The measured transmitter output power is -20.5 dBm and the attenuation is swept from 20 to 70 dB. This configuration mimics a free space communication range going from less than 10 cm to more than 20 m. The free space condition also represents a worst case scenario for both the I&D detection method and the tracking algorithm due to the sharpness of the received pulsed signal.

On the left y-axis of Fig. 7.15, the measured BER (round markers fitted by an ‘exponential+constant’ continuous curve) is compared with a simulation using the same integration length (dashed curve) and assuming perfect synchronization. The measured sensitivity level for a BER of $10^{-2}$, a common reference value in the field of wireless communications, is -83.7 dBm (dashed marker lines). This corresponds to an equivalent $E_b/N_0$ of 15.3 dB, obtained with an estimated overall front-end noise figure of 8 dB. Approximately 7.5 dB are contributed by the RF front-end IC (Channel A centered around 3.45 GHz, see Fig. 6.21) and an additional 0.5 dB comes from the VGA chain. At this BER level of $10^{-2}$, the equivalent free space attenuation is 63.2 dB and corresponds to a communication range of 9.85 m. This range is close to the targeted value of 10 m specified in Section 3.1; only the targeted bit rate of 10 Mb/s could not be reached on the realized transmitter test board due to a limitation of the off-chip driving circuit. This is the reason why the link has been characterized at 5 Mb/s.
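The link budget figures above can be checked with a short calculation. The sketch below is illustrative only: it assumes a thermal noise density of -174 dBm/Hz, the standard free-space (Friis) path loss formula, and a carrier at the channel A center of 3.45 GHz. The carrier frequency actually used for the range conversion is not stated here, so the computed range lands near 10 m rather than exactly on the quoted 9.85 m. All function and variable names are hypothetical.

```python
import math

THERMAL_NOISE_DBM_HZ = -174.0  # assumed thermal noise density near 290 K
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def eb_n0_db(p_rx_dbm, bit_rate_hz, noise_figure_db):
    """Eb/N0 in dB from received power, bit rate and receiver noise figure."""
    noise_density_dbm_hz = THERMAL_NOISE_DBM_HZ + noise_figure_db
    return p_rx_dbm - noise_density_dbm_hz - 10.0 * math.log10(bit_rate_hz)


def free_space_range_m(path_loss_db, freq_hz):
    """Distance at which the free-space path loss equals path_loss_db."""
    return SPEED_OF_LIGHT / (4.0 * math.pi * freq_hz) * 10.0 ** (path_loss_db / 20.0)


if __name__ == "__main__":
    p_tx_dbm = -20.5          # measured transmitter output power
    attenuation_db = 63.2     # attenuation at a BER of 1e-2
    carrier_hz = 3.45e9       # assumption: channel A center frequency

    p_rx_dbm = p_tx_dbm - attenuation_db
    print(f"P_RX  = {p_rx_dbm:.1f} dBm")
    print(f"Eb/N0 = {eb_n0_db(p_rx_dbm, 5e6, 8.0):.1f} dB")
    print(f"range = {free_space_range_m(attenuation_db, carrier_hz):.2f} m")
```

The same helper also reproduces the span of the attenuator sweep: 20 dB corresponds to well under 10 cm and 70 dB to more than 20 m under these assumptions.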

The measured input power level $P_{RX}$ and its equivalent SNR are reported on the top and bottom x-axes, respectively. At a BER of $10^{-2}$, the discrepancy between the simulation and the measurements is 3.5 dB (a factor of 1.5 on the free space distance). Approximately 1.5 dB can be explained by implementation losses, such as VCO phase noise and spurs (approx. 0.5 to 1 dB) and mismatch originating from the separate IC integration (wire-bonded chips) of the RF front-end, the baseband ASIC and the PLL synthesizer generating the LO. The remaining difference of 2 dB accounts for the effect of the tracking algorithm, which constantly monitors the position of the pulse during BER measurements. This tracking algorithm results in a jitter of 13 ns on the integration window position, as illustrated in Fig. 7.16. The same effect is responsible for the BER floor observed below $10^{-6}$ at SNR values above 30 dB.
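The distance factor associated with the 3.5 dB gap follows from the free-space path loss law, assuming the loss grows as $20\log_{10} d$; the symbols $\Delta L$, $d_1$ and $d_2$ are introduced here for illustration.

$$
\Delta L = 20\log_{10}\frac{d_1}{d_2} = 3.5\ \text{dB}
\quad\Rightarrow\quad
\frac{d_1}{d_2} = 10^{3.5/20} \approx 1.5
$$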

### 7.6. SNR Estimate Through CORDIC Vector Length

### 7.6.1. Measurements

The square markers referring to the right axis of Fig. 7.15 show the averaged values of the CORDIC modulus $|L_C|$ calculated during the initial acquisition (cold starts). $|L_C|$ is provided by the ASIC as a 5-bit digital word (“CordicMod” output in Fig. 7.5). The average value shown here is obtained over 20 cold starts. The figure shows that, to ensure a BER better than $10^{-2}$, $|L_C|$ has to be larger than 7 in our implementation.

Figure 7.14. Measured RF IR-UWB transmitted signals at the output of the hybrid coupler (Fig. 7.13) in frequency- (a) and time domain (b and c) for channel A between 3.2 and 3.7 GHz. The PRR is 5 MHz, the pulse stream is modulated by a binary FSK scheme and results in a composite PSD characterized by a -10 dB bandwidth slightly larger than 500 MHz.

Figure 7.15. Measured BER performance of the demodulation and prediction of the wireless link quality; for a BER better than $10^{-2}$, the measured CORDIC length must be larger than 7, which corresponds to a measured received signal power of -83.7 dBm.

Using the latter value as a threshold to decide whether the receiver shall try to decode the subsequent pulse stream (e.g. frame synchronization and payload data), the probability of false detection in the absence of a useful signal has been measured at around 13%. On the other hand, when considering all the measured $|L_C|$ corresponding to a signal level above the sensitivity level of -83.7 dBm, the probability of missed detection is below 20%. These probability values have been obtained from a single CORDIC evaluation and have not been averaged, as is the case for the values shown in Fig. 7.15. Thus, the value of the CORDIC length provided during the acquisition is a trustworthy assessment of the signal quality. This assessment can be improved by monitoring the control voltage of the VGA or by increasing the observation time during acquisition.

### 7.7. BER Measurements with Free-running VCO

It has already been demonstrated [160] that the required center frequency accuracy of UWB signals is less stringent owing to the large bandwidth occupancy. In our topology, the frequency deviation $f_o$ for the BFSK modulation is $\pm 150$ MHz and allows up to 20 MHz of frequency offset between the receiver and the transmitter with negligible loss in performance. This relaxed frequency accuracy allows us to use a calibrated free-running oscillator instead of a PLL-based device. The main motivation is a drastic reduction of the power consumption of the frequency generation device in both transmitter and receiver. A calibration method to set the frequency of a free-running VCO is proposed in Appendix D.

Figure 7.17 shows measurements of the receiver’s sensitivity degradation caused by an offset in the LO frequency from 0 to 50 MHz. These results have been compared with a simulation and exhibit a similar behavior. The measured point at zero offset corresponds to a PLL-based configuration and to the signal level of -83.7 dBm highlighted in Fig. 7.15; the half-decibel discrepancy occurring with the free-running VCO is caused by the inherently increased close-in phase noise of such a configuration. On the other hand, the selectivity of the receiver is increased owing to the absence of reference spurs in the LO output spectrum, which usually occur with a conventional PLL [92].

Figure 7.17. Receiver sensitivity degradation due to free-running VCO. At frequency offsets smaller than 20 MHz between the receiver and the transmitter, the use of a calibrated free-running VCO degrades the sensitivity by only 0.5 dB. Such a technique helps to reduce the overall power consumption by avoiding the overhead time needed for the PLL to lock.

### 7.8. Clock Offset Sensitivity

The synchronization method used in this IC is largely insensitive to a mismatch of the clock frequency $f_{\text{clk}}$ between the receiver and the transmitter, since it relies on the integration of a short 5 ns pulsed signal over a 40 ns period. During the decoding process, the “on-time” integrating window is adjusted continuously. The sensitivity of the baseband ASIC to clock offsets has been investigated. The measurements show that the BER stays below $10^{-3}$ for a clock offset of up to $\pm 0.8\%$ between the transmitter and the receiver.
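To give a feeling for what this offset means in time, the drift per pulse period can be estimated. The numbers below assume a PRR of 5 MHz (pulse period $T_p = 200$ ns), as used for the BER measurements; the PRR of the clock offset measurement is not stated here, and the symbols $\varepsilon$, $\Delta t$ and $N$ are introduced for illustration only.

$$
\Delta t = \varepsilon\, T_p = 0.008 \times 200\ \text{ns} = 1.6\ \text{ns per pulse period},
\qquad
N \approx \frac{40\ \text{ns} - 5\ \text{ns}}{1.6\ \text{ns}} \approx 22
$$

Under these assumptions, a 5 ns pulse starting at one edge of a 40 ns window would leave it after roughly 22 pulse periods if the window were not adjusted, which is why the continuous adjustment of the “on-time” window matters.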

This weak dependence of the BER performance on a clock frequency offset is suitable for a class of wireless applications that is not based on accurate crystal time bases [161]. A commercially available CMOS clock generator yields a $\pm 0.5\%$ total frequency accuracy over a $0$ to $70^\circ$C temperature range and a $\pm 5\%$ voltage range without using ceramic resonators, quartz crystals or other external components [162]. Frequency variations comparable to [163] and smaller than [164] $\pm 0.8\%$ over a relatively large supply voltage variation and a temperature range between $-20$ and $100^\circ$C have been reported in the literature for crystal-less oscillators using compensated on-chip low-power CMOS ring oscillators. Reference [163] also proposes an adaptive biasing scheme incorporating process corner sensing to compensate for process variations, which normally add variations in the order of $\pm 20\%$. [163] shows that the resulting $2\sigma$-variations in frequency between different wafer runs could be brought down to $\pm 2.6\%$ without any trimming, including temperature and supply voltage variations (the compensated process variation typically contributes $\pm 2.1\%$ to the overall variations).

### 7.9. Comparison with Other Systems

A direct comparison of the obtained results with the similar works briefly reviewed in the next paragraphs [153–155, 165, 166] is difficult due to the different system design parameters, such as data rate, modulation, coding and availability of channel estimation mechanisms. Nevertheless, a summary and a comparison of the different solutions are given in Table 7.1.

1) Reference [165] describes a direct-sequence baseband IR-UWB processor designed for BPSK modulation with a spreading factor of 31. The 300 MHz pulses ($t_p \approx 3.3$ ns) are two times oversampled at 1.2 Gsps. The chip features a highly parallelized architecture to enable a low frequency clock, as low as 30 MHz in the digital back-end. The detection is achieved by a correlator bank using rectangular windows, meaning that no channel estimate is performed.

2) Reference [166] reports on the realization of a UWB system having an input bandwidth $BW_{in}$ of 1 GHz and using direct-sequence spread spectrum with a variable spreading factor from 1 to 1024. At the maximal spreading factor (61 kb/s), the system achieves more than 20 dB of SNR margin for a 10 m transmitter-receiver separation, operating under the UWB FCC regulation.

3) The topology used in [155] targets very low power consumption and has to be used with a separate analog front-end featuring quadrature analog correlation (QAC) and implementing two 4-bit ADCs running at the data rate. This digital back-end is intended for sensor networks communicating at bit rates up to 50 Mb/s. Lower bit rates can benefit from processing gain but no channel estimate is available. A low clock speed at twice the PRR is used and power consumption is dynamically traded by using configurable digital modules. The results shown in Table 7.1 are obtained with a spreading factor of 15 at 40 Mp/s. No mention of the BER performance has been found in the text. Note that this implementation does not include the analog correlation and the ADC.

4) Reference [153] presents an IR-UWB transceiver based on M-ary bi-orthogonal keying (MBOK) modulation between 3 and 5 GHz. The system is designed for indoor wireless communications up to 100 Mb/s. A full-digital baseband RAKE receiver is adopted for multipath effect cancellation.

5) In [154], several hardware solutions, such as power gating, have been adopted to drastically reduce the power consumption. The realized circuit features a highly parallelized architecture. It has been implemented in a 90 nm CMOS technology and can be operated at a very low supply voltage (0.4 V). This solution also enables channel estimation (5-finger RAKE receiver).
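The bit rates quoted for the direct-sequence systems above follow from the pulse rate divided by the spreading factor. The sketch below checks this relation; the pulse rates of 6 Mp/s for [165] and 62.5 Mp/s for [166] are read from Table 7.1 and are an assumption insofar as the column headers of that table are not available here. The processing gain uses the usual idealized $10\log_{10}$ of the spreading factor, and the function names are hypothetical.

```python
import math


def bit_rate_mbps(pulse_rate_mpps, spreading_factor):
    """Bit rate in Mb/s when each bit is spread over `spreading_factor` pulses."""
    return pulse_rate_mpps / spreading_factor


def processing_gain_db(spreading_factor):
    """Idealized processing gain of direct-sequence spreading in dB."""
    return 10.0 * math.log10(spreading_factor)


if __name__ == "__main__":
    # (pulse rate in Mp/s, spreading factor); pulse rates partly assumed
    systems = {
        "[165]": (6.0, 31),
        "[166]": (62.5, 1024),
        "[155]": (40.0, 15),
    }
    for ref, (rate, sf) in systems.items():
        kbps = 1000.0 * bit_rate_mbps(rate, sf)
        print(f"{ref}: {kbps:8.1f} kb/s, gain {processing_gain_db(sf):4.1f} dB")
```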

### 7.10. Summary and Conclusions

The realization and the characterization of a low-power mixed-signal baseband ASIC are described in this Chapter. A summary of the complete receiver chain, including the RF front-end, the VGA and the baseband ASIC, is given in Table 7.2. The first two circuits have been described in the previous Chapter. The baseband device features a BFSK demodulator, a non-coherent signal detector based on an integrate-and-dump function and a bit-level synchronization mechanism. A synchronization algorithm for low-complexity and LDR IR-UWB receivers has been
**Table 7.1.** System Comparison of Existing IR-UWB Baseband Processors

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>[165]</td>
<td>BB 3.3</td>
<td>1.2 (4b)</td>
<td>BPSK</td>
<td>0.6</td>
<td>70</td>
<td>6</td>
<td>180-1.8</td>
<td>300</td>
<td>388n</td>
<td>-18.3 (193 kb/s)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>[166]*</td>
<td>BB -</td>
<td>Polar 1</td>
<td>Polarity</td>
<td>2 (1b)</td>
<td>RAKE $\Diamond$</td>
<td>131</td>
<td>62.5</td>
<td>180-1.8</td>
<td>163</td>
<td>252p</td>
<td>22.0 (62.5 Mb/s)</td>
<td></td>
</tr>
<tr>
<td>[155]**</td>
<td>CB 2-0.5</td>
<td>BPSK 0.5-2</td>
<td>PRR (4b)</td>
<td>QAC $\Diamond$</td>
<td>19</td>
<td>40</td>
<td>130-0.95</td>
<td>80</td>
<td>700p $\dagger$ (1n $\ddagger$)</td>
<td>N/A (2.66 Mb/s)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>[154]*</td>
<td>CB 2</td>
<td>BPSK 0.25</td>
<td>0.5 (5b)</td>
<td>-</td>
<td>RAKE $\Diamond$</td>
<td>-</td>
<td>100</td>
<td>90-0.4</td>
<td>25</td>
<td>20p</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>[153]</td>
<td>CB -</td>
<td>MBOK 0.5</td>
<td>1 (4b)</td>
<td>RAKE $\Diamond$</td>
<td>-</td>
<td>100</td>
<td>180-1.8</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>THIS WORK</td>
<td>CB 5</td>
<td>BFSK 0.25</td>
<td>PRR (1b)</td>
<td>I&amp;D $\Diamond$</td>
<td>$\frac{20}{PRR}$</td>
<td>1-20</td>
<td>180-1.8</td>
<td>30-PRR</td>
<td>1.3n $\dagger$ (2.6n $\ddagger$)</td>
<td>15.3 (5 Mb/s)</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

abbrev. : BB = baseband, CB = carrier-based (quadrature signals); PRR = pulse repetition rate; $\Diamond$/ $\Diamond$ without/with channel estimate; * without ADC; ** without analog correlation and ADC (additional 1 nJ/b is estimated from [167]); $\dagger$ tracking mode, $\ddagger$ acquisition mode.
Table 7.2. Receiver performance summary

<table>
<thead>
<tr>
<th colspan="2">Baseband ASIC</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology/Supply voltage</td>
<td>0.18 μm CMOS / 1.8 V</td>
</tr>
<tr>
<td>Baseband IC area</td>
<td>1.5 x 1.5 mm</td>
</tr>
<tr>
<td>Active area</td>
<td>0.25 mm²</td>
</tr>
<tr>
<td>Clock frequency</td>
<td>0-600 MHz (= 30 x pulse rate)</td>
</tr>
<tr>
<td>Flip-flop count</td>
<td>101</td>
</tr>
<tr>
<td>Bit rate</td>
<td>1-20 Mb/s (1 pulse per bit)</td>
</tr>
<tr>
<td>Demodulation</td>
<td>Quadricorrelation</td>
</tr>
<tr>
<td>Detection</td>
<td>Non-coherent, I&amp;D</td>
</tr>
<tr>
<td>Energy per acquisition</td>
<td>70 nJ (20 bits at 5 Mb/s)</td>
</tr>
<tr>
<td>Energy per decoded bit</td>
<td>1.3 nJ (5 Mb/s, 1.8 V)</td>
</tr>
<tr>
<td>SNR at BER $10^{-2}$</td>
<td>$E_b/N_0 = 15.3$ dB, uncoded</td>
</tr>
</tbody>
</table>

Full receiver power consumption (tracking mode, 5 Mpulses/s)

<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>RF front-end (LNA and I/Q mixers)</td>
<td>14.5 mW</td>
</tr>
<tr>
<td>LO (free-running VCO, duty-cycled)</td>
<td>8.5 mW</td>
</tr>
<tr>
<td>VGA (with AGC)</td>
<td>5.4 mW</td>
</tr>
<tr>
<td>Bias (CMFB, replica)</td>
<td>1 mW</td>
</tr>
<tr>
<td>Baseband</td>
<td>6.5 mW</td>
</tr>
<tr>
<td><strong>Total</strong></td>
<td><strong>35.9 mW (7.2 nJ/b)</strong></td>
</tr>
</tbody>
</table>

developed, simulated, implemented on a chip and verified. Some of the results given in this Chapter have been published by the author in [168].

Our circuit is fabricated using a 0.18 μm CMOS technology and has been tested with the entire RF front-end described in Chapter 6. Measurements with the front-end result in a BER curve that translates into a receiver sensitivity of -83.7 dBm. This value roughly corresponds to a communication distance of 10 m in free-space without any error correction or other coding scheme. The power consumption of the realized baseband ASIC during the acquisition of IR-UWB signals is 13 mW at 5 Mb/s, while it only requires 6.5 mW in tracking mode.
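The energy-per-bit figures of Tables 7.1 and 7.2 follow directly from the power values and the bit rate. The sketch below recomputes them from the block powers listed in Table 7.2, assuming one pulse per bit at 5 Mb/s; the dictionary keys and the function name are hypothetical labels.

```python
def energy_per_bit_nj(power_mw, bit_rate_mbps):
    """Energy per bit in nJ; mW divided by Mb/s gives nJ per bit."""
    return power_mw / bit_rate_mbps


# Receiver block powers in tracking mode at 5 Mpulses/s (from Table 7.2)
blocks_mw = {
    "rf_front_end": 14.5,
    "lo_free_running_vco": 8.5,
    "vga_with_agc": 5.4,
    "bias": 1.0,
    "baseband": 6.5,
}

if __name__ == "__main__":
    bit_rate = 5.0  # Mb/s
    total_mw = sum(blocks_mw.values())
    print(f"total receiver power : {total_mw:.1f} mW")
    print(f"receiver, tracking   : {energy_per_bit_nj(total_mw, bit_rate):.2f} nJ/b")
    print(f"baseband, tracking   : {energy_per_bit_nj(6.5, bit_rate):.2f} nJ/b")
    print(f"baseband, acquisition: {energy_per_bit_nj(13.0, bit_rate):.2f} nJ/b")
```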
In this dissertation, an entire UWB wireless link has been developed. From system specifications to the characterization of a link using UWB signal characteristics, several issues have been addressed throughout this work.

The first main goal of this thesis was the investigation of the new possibilities enabled by the large bandwidth offered by the amendment of the emission rules for UWB signalling. We investigated the benefits and drawbacks of UWB for the realization of low-power, low-complexity, small form factor but robust wireless transceivers. The first benefit is the wide available bandwidth, which enables a relaxed frequency accuracy and sufficient diversity to mitigate the adverse effect of multipath fading. However, this ill-defined frequency accuracy comes, first of all, at the cost of bandwidth efficiency. But contrary to narrowband systems, which have to cope with very limited bandwidth resources, UWB wireless systems can advantageously employ the excess bandwidth to increase link robustness. On the other hand, the large bandwidth exposes UWB systems to spurious signals from other systems or from their own digital baseband.

### 8.1. IR-UWB Transmitters

By using ill-defined signals in the frequency domain and by exploiting the duty-cycled characteristic of pulsed signals, it has first been demonstrated that very low power consumption can be achieved by means of a free-running LC oscillator that can be completely switched off between pulse transmissions (see [86] and Appendix B). Unfortunately, the lack of accuracy of the generated signals both in frequency and spectrum prevents the use of such a solution in practical systems.

When using low-complexity UWB transmitters, the complexity of the link has to be transferred to the receiver. The latter has to cope with inaccurate frequencies and needs more time to detect signals. An additional issue is the lack of selectivity, which makes the receiver prone to interferers. Moreover, and especially with respect to the tight European regulation, it is very difficult to obtain a controllable spectrum without additional filtering. Filtering methods have also been investigated in the form of an output stage featuring a bandpass pulse shaping filter in the frequency domain (see [87] and Appendix C). This kind of circuit can be used as an output stage for low-complexity transmitters generating ill-defined UWB pulses. However, the limited Q-factor of on-chip inductors prevents tight filtering and limits the application of such circuits to single-band UWB devices under FCC regulation. The stringent emission masks of the European regulation require more complex solutions involving signal generation using higher-order filters at baseband and up-conversion (carrier-based solution).

The accuracy needed to meet the European regulation in terms of frequency spectrum does not require the use of PLL devices. It has been demonstrated in this work that a frequency accuracy in the order of one percent is needed to meet the European regulation. This level of accuracy can be reached with a calibrated free-running voltage-controlled oscillator (see [92] and Appendix D). The calibration method still requires the implementation of a PLL to set the control voltage of the VCO to the desired value, but the calibration procedure only has to be repeated every few hundred milliseconds to compensate for potential drifts caused by temperature and supply voltage variations. The advantage of this method clearly resides in the self-calibration ability of the carrier, which does not require any manual tuning procedure (e.g. laser trimming) to compensate for process variations.

An on-chip PLL has also been developed. It features a single wideband VCO that covers the UWB band between 3 and 5 GHz over temperature and process variations. The PLL meets the requirements needed for the aforementioned self-calibration and, more specifically, has been designed for direct frequency modulation of the transmitted IR-UWB pulses (Chapter 5). It achieves binary frequency shift keying modulation up to 10 Mb/s. The modulation rate has been specified in accordance with the ability of the channel to cope with inter-pulse interference (Chapter 3), as summarized in the next section.
Finally, a single-chip IR-UWB transmitter using BFSK modulation and offering three channels between 3.3 and 4.8 GHz has been implemented (Chapter 4). It uses the implemented PLL in closed loop, but effective power reduction schemes have been investigated for the up-conversion section and the output stage.

### 8.2. IR-UWB Receiver

On the receiver side, the complete system has first been specified in Chapter 3. This in-depth study shows that the limitations of low-complexity IR-UWB wireless links are mainly imposed by the regulations (Chapter 2) and the characteristics of the propagation channel, such as inter-pulse interference. In this work, no equalization or adaptive digital post-processing is employed. The signal detection is simply based on the energy collection principle. Only the integration length and the pulse repetition rate can be matched to different indoor channel scenarios. Based on the channel considerations, a modulation scheme based on binary frequency shift keying has been proposed. This scheme eases the implementation of a direct-conversion receiver [168].

The RF front-end described in Chapter 6 features a 3-5 GHz UWB LNA, which feeds quadrature mixers for frequency down-conversion. A variable gain amplifier [169] has also been developed specifically for our application. It provides 60 dB of voltage gain over a bandwidth of 180 MHz. This circuit also features an automatic gain control that compensates transmitter-receiver range variations and multipath fading effects.

The back-end section consists of a mixed-mode integrated circuit (Chapter 7), whose function is to demodulate the down-converted quadrature signal into a bipolar pulsed signal. The analog signal is then integrated over a duration of 40 ns according to the "energy-collection" principle and converted into a digital signal. Thanks to the bipolar characteristics of the demodulated signal, the analog-to-digital conversion can simply be based on comparators without threshold setting (1-bit A/D conversion). The digital section of the baseband chip also synchronizes to the incoming pulse stream and provides the digital data. The baseband processor is also able to provide an estimate of the signal-to-noise ratio without any information on the received signal strength. This enables an estimation of the link quality prior to any further processing or data storage. This assessment is based on the monitoring of the incoming signal over several pulse periods with the help of a CORDIC algorithm.

The entire IR-UWB link exhibits a communication range of 10 m at 5 Mb/s in free-space without any error correction or other coding scheme. This corresponds to a receiver sensitivity of -83.7 dBm for a bit error rate of $10^{-2}$. The power consumption of the entire receiver in tracking mode (after synchronization) at 5 Mb/s reaches 36 mW. This corresponds to an energy consumption of 7.2 nJ per decoded bit.

### 8.3. Outlook

There are various challenging issues that await exploration of IR-UWB
transceiver in future research. First of all, the different integrated cir-
cuits realized in this work have to be merged on a single IC. This
would “only” require an additional design effort since these devices have
been developed with this goal in mind. For instance, they all use fully
balanced implementation, same power supply and CMOS technology.
Considering a single chip transceiver implementation, an on-chip UWB
transmit-receive (T/R) switch is needed after the antenna to isolate the
transmitter output from the LNA input and vice versa. Such a device
further increases the noise figure of the receiver due to its insertion
loss and is therefore challenging, especially when realized in CMOS. A
second issue that has not been completely explored in this work is the
implementation of the channel filter on the receiver side. This channel
filter is based on the same topology as the one used for the pulse
shaping filter and thus, could be re-used with minor modifications.

Some ideas proposed in this work can be pushed further. For
example, using the proposed self-calibrated VCO, the number of PLL
frequencies required for the calibration can be made smaller than the
set of frequencies actually used. The complete set of frequencies can be generated
by interpolating between the effectively calibrated frequencies. This would
ease the design of the calibration PLL (integer-N) and reduce the time
needed to calibrate the VCO. Note that this calibration method has
not been used in the single-chip transmitter for the BFSK modulation.
A potential improvement would be the implementation of a modulation
circuit using a dual-DAC device to modulate the VCO at higher rates.
This will also help to reduce the power consumption of the transmitter
section by about one third.

From a system point of view, the 3-5 GHz band is clearly no longer
the most promising band for UWB applications. As of 2010,
it seems that the preferred band lies within the 6.5-8 GHz frequency range, owing to its worldwide availability. Therefore, on the receiver side, a redesign of the receiver front-end (LNA and mixer) is required. The output stage and the up-conversion mixer also need to be redesigned. Regarding blocks common to Tx and Rx, the VCO and the PLL divider must be replaced. Designing an IR-UWB transceiver using the same 0.18 µm technology would probably be very challenging, especially in terms of power consumption of the analog front-end. Nevertheless, low power is still possible with the use of a more advanced 90 nm CMOS process with a supply voltage as low as 1 V.

A. Equations

A.1. Influence of $C_{gd}$ on Input Q-factor of the LNA

This calculation refers to Fig. 6.5 and results in Eq. 6.12. $C_{gs}$ and $L_s$ form a series circuit expressed by the reactance $X_g$ or the susceptance $B_g$:

$$X_g = \omega L_s - \frac{1}{\omega C_{gs}}, \quad B_g = \frac{\omega C_{gs}}{\omega^2 C_{gs} L_s - 1}.$$

(A.1)

We define $Q_g$ as the quality factor associated with this reactance in series with the resistive part formed by $\omega_T L_s$:

$$Q_g = \frac{X_g}{\omega_T L_s}.$$

(A.2)

We transform the series resonant network $Z_g = \omega_T L_s + jX_g$ into its equivalent parallel resonant network. Note that this transformation is only valid around a single frequency, chosen as $\omega = \omega_c$, the center frequency of the band of interest. We obtain

$$R_{g[\text{par}]} = \omega_T L_s (1 + Q_g^2),$$

(A.3)

$$X_{g[\text{par}]} = X_g (1 + Q_g^{-2}), \quad B_{g[\text{par}]} = B_g \frac{Q_g^2}{1 + Q_g^2}.$$

(A.4)

We add the susceptance due to the Miller effect, i.e. \( B_M = -\omega_c M\alpha_C C_{gs} \), to \( B_{g[\text{par}]} \) to obtain the overall susceptance

\[
B'_{g[\text{par}]} = B_{g[\text{par}]} + B_M
\]

\[
B'_{g[\text{par}]} = \frac{\omega_c C_{gs}}{\omega_c^2 C_{gs}L_s - 1} \cdot \frac{Q_g^2}{1 + Q_g^2} - \omega_c M\alpha_C C_{gs} \quad \text{(A.5)}
\]

\[
\approx B_{g[\text{par}]} \left[ 1 + M\alpha_C \left( 1 + \frac{1}{Q_g^2} \right) \right]. \quad \text{(A.6)}
\]

The parallel resistance and reactance with the Miller effect are

\[
R'_{g[\text{par}]} = R_{g[\text{par}]} = \omega_T L_s (1 + Q_g^2) \quad \text{(A.9)}
\]

\[
X'_{g[\text{par}]} = \frac{1}{B'_{g[\text{par}]}} = \frac{X_g(1 + Q_g^{-2})}{\left[ 1 + M\alpha_C \left( 1 + \frac{1}{Q_g^2} \right) \right]} \quad \text{(A.10)}
\]

We transform the parallel circuit back into a series resonant network \( Z_g' \), whose quality factor \( Q_g' \) is (Eq. 6.12)

\[
Q_g' \approx Q_g \cdot \left[ 1 + M\alpha_C \left( 1 + \frac{1}{Q_g^2} \right) \right].
\]

The validity of this approximation is ensured by the fact that the common-source stage is only lightly degenerated and the nature of the impedance seen at the gate of the transistor is capacitive.
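
The approximation leading to Eq. A.6 can be checked numerically. The sketch below compares the exact ratio between the Miller-loaded and unloaded parallel susceptances with the bracketed correction factor. All component values are hypothetical placeholders, not the values of the LNA in this work, and the product $M\alpha_C$ is read here as the factor that relates the Miller capacitance to $C_{gs}$, which is an interpretation of the notation.

```python
import math

def miller_correction(f_c, c_gs, l_s, f_t, m, alpha_c):
    """Return (Q_g, exact factor, approximate factor) for the Miller loading."""
    w_c = 2.0 * math.pi * f_c
    w_t = 2.0 * math.pi * f_t
    x_g = w_c * l_s - 1.0 / (w_c * c_gs)            # series reactance, Eq. A.1
    q_g = x_g / (w_t * l_s)                         # quality factor, Eq. A.2
    b_par = (1.0 / x_g) * q_g**2 / (1.0 + q_g**2)   # parallel susceptance, Eq. A.4
    b_m = -w_c * m * alpha_c * c_gs                 # Miller susceptance
    exact = (b_par + b_m) / b_par                   # Eq. A.5 divided by B_g[par]
    approx = 1.0 + m * alpha_c * (1.0 + 1.0 / q_g**2)  # bracket of Eq. A.6
    return q_g, exact, approx

# Hypothetical values: 4 GHz centre frequency, 200 fF, 0.2 nH, f_T = 40 GHz
q_g, exact, approx = miller_correction(4e9, 200e-15, 0.2e-9, 40e9, m=2.0, alpha_c=0.3)
print(f"Q_g = {q_g:.2f}, exact factor = {exact:.3f}, approximation = {approx:.3f}")
```

With a lightly degenerated stage ($\omega_c^2 C_{gs} L_s \ll 1$) the two factors differ by about one percent for these placeholder values, which illustrates the condition under which the approximation holds.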

### A.2. Common-source Transconductance \( G_m \)

The following equations refer to Fig. 6.6. The transconductance \( G_m \) of the common-source stage, including the input resonant network, can be written as

\[
G_m = \frac{i_d}{v_G} = \frac{g_{m1}v_{gs1}}{v_G}, \quad \text{(A.11)}
\]

where

\[
v_{gs1} = \frac{i_g}{j\omega C_{gs1}}, \quad i_g = i'_g \cdot \frac{\frac{1}{j\omega C_M}}{\frac{1}{j\omega C_M} + Z_g} \quad \text{and} \quad i'_g = \frac{v_G}{R_G + j\omega L_g + Z_g \| \frac{1}{j\omega C_M}}.
\]

From the above expressions, the transconductance \( G_m \) can be rewritten as

\[
G_m = g_{m1} \cdot \frac{1}{j\omega C_{gs1}} \cdot \underbrace{\frac{\frac{1}{j\omega C_M}}{\frac{1}{j\omega C_M} + Z_g}}_{\approx (1 + M\alpha_C)^{-1}} \cdot \underbrace{\frac{1}{R_G + j\omega L_g + Z_g \| \frac{1}{j\omega C_M}}}_{1/(2R_G) \text{ at resonance and Z-match}} \quad \text{(A.12)}
\]

The second fraction in the previous equation comes from the current divider formed by $C_M$ and $Z_g$ and can be replaced by $(1 + M\alpha_C)^{-1}$; this transfer function, which defines the amount of current flowing into $Z_g$, is a second-order low-pass filter having a corner frequency of

$$\omega_M = \sqrt{\frac{1 + M\alpha_C}{L_s C_M}} = \sqrt{\frac{1 + \frac{1}{M\alpha_C}}{L_s C_{gs1}}} \approx 20 \text{ GHz.} \quad (A.13)$$

The last fraction originates from the input current $i'_g$ of the LNA. At resonance $\omega = \omega_c$ and for perfect impedance match, this expression can be replaced by $1/(2R_G)$. In this case, the resulting transconductance simplifies to

$$G_m(\omega_c) \approx g_{m1} \frac{1}{j\omega_c C_{gs1}} \cdot \frac{1}{1 + M\alpha_C} \cdot \frac{1}{2R_G} \quad (A.14)$$

$$\approx \frac{\omega_T}{j\omega_c} \cdot \frac{1}{1 + M\alpha_C} \cdot \frac{1}{2R_G}. \quad (A.15)$$
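
As a rough illustration of Eqs. A.13 and A.15, the sketch below evaluates both forms of the corner frequency and the magnitude of the simplified transconductance. It assumes that the Miller capacitance is $C_M = M\alpha_C C_{gs1}$, which is what makes the two forms of Eq. A.13 identical. The numerical values are hypothetical and are not the design values of this work.

```python
import math

def miller_corner_frequency(l_s, c_gs1, m_alpha):
    """Both forms of Eq. A.13, assuming C_M = (M * alpha_C) * C_gs1."""
    c_m = m_alpha * c_gs1
    first = math.sqrt((1.0 + m_alpha) / (l_s * c_m))
    second = math.sqrt((1.0 + 1.0 / m_alpha) / (l_s * c_gs1))
    return first, second

def gm_magnitude_at_resonance(f_t, f_c, m_alpha, r_g):
    """Magnitude of Eq. A.15 in siemens."""
    return (f_t / f_c) / (1.0 + m_alpha) / (2.0 * r_g)

# Hypothetical values: L_s = 0.5 nH, C_gs1 = 300 fF, M*alpha_C = 0.6
w_first, w_second = miller_corner_frequency(0.5e-9, 300e-15, 0.6)
g_m_eff = gm_magnitude_at_resonance(f_t=40e9, f_c=4e9, m_alpha=0.6, r_g=50.0)

print(f"corner frequency: {w_first / (2 * math.pi) / 1e9:.1f} GHz")
print(f"|G_m| at resonance: {g_m_eff * 1e3:.1f} mS")
```

The factor $\omega_T/\omega_c$ shows how the input resonant network boosts the effective transconductance, while the Miller term reduces it.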

B. Low-power UWB Wavelets Generator

Abstract$^1$ - A low-power fully integrated ultra-wideband (UWB) wavelet generator is presented. This UWB generator is intended for low-power and low-complexity UWB radio technology, using the non-coherent energy collection approach. The wavelet generator is based on a cross-coupled inductance-capacitance (LC) oscillator. It can be directly driven by two digital signals, which can modulate the length, the position and the phase of the output wavelet. An additional digital circuit improves the start-up time of the oscillator so that the oscillator and the output buffers can be switched off between each wavelet generation. The entire chip - including output buffers - uses a 0.18 µm CMOS technology. When operating at 10 megapulses per second (Mp/s) with a 1.2 GHz bandwidth wavelet, the generator provides a typical average output power of -20 dBm and consumes only 1.8 mW. The differential output signal is a multicycle waveform centered at 4.5 GHz.

B.1. Introduction

Besides the solutions proposed by industrial consortia, which mainly aim at transmission rates on the order of 100 Mb/s or more, UWB is also seen as an interesting technology for new emerging communication and location tracking radio systems with very low power consumption (typically in the range of a few nanojoules per transmitted bit), low complexity and high integration level (i.e., an antenna, a quartz-based time reference and an energy source are the only off-chip components). For example, several such solutions propose receivers based on the energy collection approach [170] [171] to implement low data rate UWB (UWB-LDR) radio links (typically between 0.1 and 10 Mb/s). These non-coherent approaches avoid the need for implementing sophisticated channel estimation schemes, trading reduced sensitivity and reduced noise and interference rejection for simplicity and very low power consumption. These solutions are mainly intended for ultra low-power Wireless Personal Area Networks (WPAN), Wireless Sensor Networks (WSN) [172], Wireless Body Area Networks (WBAN) [53] and radio-frequency tags [173].

$^1$ Copy of reference [86], with updated references.

The circuit presented in this Appendix is a fully integrated implementation of a UWB wavelet generator intended for low-power and low-complexity UWB radio systems based on non-coherent demodulation schemes.

Figure B.1 illustrates a typical UWB front-end architecture using the energy collection approach. The transmitting chain (Tx) consists of an integrated wavelet generator, which sends the signal through a transmit/receive switch (T/R) followed by a pre-select filter and an antenna. The receiving path (Rx) is composed of a UWB low-noise amplifier such as proposed in [122], a self-mixing rectifier, a baseband amplifier and a gated integrator.

**Figure B.1.** Typical architecture of a fully integrated non-coherent UWB transceiver using the energy collection approach.

This paper reports on the design and implementation of the wavelet generator in the transmitter path. Section B.2 and Section B.3 discuss
B.2. UWB Signal Generation Techniques

We can distinguish between two main classes of pulsed UWB waveforms. The first class of signals is based on the Gaussian monocycle pulse and provides signals with only one or a small number of cycles. In practice, two to three cycles can be generated by differentiating the Gaussian monocycle. The second class of waveforms features a larger number of cycles and can be realized by the amplitude modulation of a sine wave. At equivalent centre frequency, the latter class provides signals that can be seen as wavelets with longer durations and thus narrower bandwidths.

These two classes of waveforms are strongly related to their respective generation methods. UWB signals belonging to the second class are generated by gated oscillators (e.g., a tunnel diode or an oscillator with an output gate [174]) or by up-converting a baseband pulse, as proposed in [160] and [175]. Monocycle UWB signals can be generated either by switching circuits (e.g., fast logic circuits such as in [176]) or by techniques involving current, such as translinear circuits [177] [178], $g_{m}$-C cells [179] or step-recovery diodes used in conjunction with microstrip lines [180].

Most of the aforementioned solutions are not viable for standard silicon implementations. Low-cost integrated technologies such as CMOS or BiCMOS preclude the use of components like step-recovery or tunnel diodes since they are not part of standard device libraries. Furthermore, one of the main issues when using very short Gaussian monocycles at low pulse rates (typically below 10 Mp/s$^2$) is the peak voltage needed to exploit the allowed UWB power density levels. With short-duration waveforms such as a Gaussian monocycle centered at 4 GHz, which features a peak-to-peak duration of 200 ps, a peak-to-peak voltage of more than 5 V is needed to reach the maximal allowed power density of -41.3 dBm/MHz [8] at a pulse rate of 10 Mp/s. Thus, this high voltage requirement prevents such waveforms from being generated by a deep-submicron CMOS technology, where the transistor breakdown voltages are typically well below 3 V.

$^2$ It is assumed that UWB low-power transceivers using non-coherent pulse position modulation (PPM) schemes typically do not exploit pulse rates above 10 Mp/s. This is mainly due to the limitations of the channel characteristics. Strong multipath echoes in indoor environments increase the delay spread up to 100 ns or more.

B.3. Design Constraints

Some design choices for emitted UWB signals are dictated by regulation. The emission limits released by the US-based FCC essentially consist of the allowed frequency range from 3.1 to 10.6 GHz, the minimum required -10 dB bandwidth, and the average and peak power constraints. For a sine signal modulated by a square pulse, a -10 dB bandwidth extending beyond 500 MHz translates into a pulse duration shorter than 3 ns.

Furthermore, since the pulse repetition rate of non-coherent UWB schemes is typically on the order of 10 Mp/s or less, the equivalent duty cycle of the pulse generator is below 3%. This low duty cycle can be advantageously exploited for low-power applications. By switching the whole circuit completely off during the idle periods, the overall power consumption can be drastically reduced. However, this implies the use of an oscillator with a settling time in the range of one nanosecond or below.
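
The 3 ns and 3% figures can be reproduced with a short calculation. A sine wave gated by a rectangular pulse of duration $T$ has a power spectrum shaped like $\mathrm{sinc}^2((f - f_c)T)$ around the carrier, so the -10 dB bandwidth is proportional to $1/T$. The sketch below finds the proportionality factor by bisection; it assumes an ideal rectangular envelope and ignores the negative-frequency image, and the function names are illustrative.

```python
import math

def sinc_squared(x):
    """Normalized power spectrum envelope of a rectangular pulse."""
    if x == 0.0:
        return 1.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return s * s

def half_width_at_level(level):
    """Bisection for x in (0, 1) where sinc^2(x) has dropped to `level`."""
    lo, hi = 0.0, 1.0  # sinc^2 decreases monotonically from 1 to 0 here
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sinc_squared(mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_10db = half_width_at_level(0.1)   # one-sided offset (f - f_c) * T at -10 dB
bt_product = 2.0 * x_10db           # two-sided -10 dB bandwidth times T

max_duration = bt_product / 500e6   # longest pulse that still spans 500 MHz
duty_cycle = max_duration * 10e6    # at a pulse rate of 10 Mp/s

print(f"-10 dB bandwidth x duration = {bt_product:.3f}")
print(f"maximum pulse duration = {max_duration * 1e9:.2f} ns")
print(f"duty cycle at 10 Mp/s = {duty_cycle * 100:.2f} %")
```

A real wavelet has finite rise and extinction times, so its bandwidth is wider than this rectangular-envelope estimate for the same trigger length.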

In this work we investigate the realization of pulsed-UWB signal generators based on oscillating circuits and methods to reduce the start-up time of oscillators. Such solutions generate UWB signals having a multicycle property or, equivalently, pulses having a bandpass characteristic with narrower bandwidth, typically between 500 MHz and half the pulse centre frequency. This type of signal is particularly interesting for on-chip implementation since it allows the peak-to-peak voltage of the pulsed signal to be reduced.

B.3.1. Oscillator Equivalent Circuit

The goal of this section is to derive an analytical expression that describes the start-up transient phase of an oscillator, whose active device features a nonlinear amplitude limiting function.

As illustrated in Fig. B.5 (middle dashed boxes), the practical oscillator considered for analysis uses a cross-coupled transistor pair and an equivalent “GLC” parallel resonator, which takes into account parasitics and other additional components (i.e., load, output and input impedances of the active devices as well as the frequency tuning devices such as varactors). A simplified equivalent circuit is shown in Fig. B.2.
The cross-coupled pair is modelled by a nonlinear negative conductance $g_m(v)$.

A first-order approximation of the nonlinear $i$-$v$ transfer function of a differential pair is given in [181]. The $i$-$v$ characteristic of a cross-coupled differential pair is obtained by inverting the polarity of the input voltage $v$. We obtain

$$i_P(v) = \begin{cases} -I_b & \text{if } v \geq \sqrt{2}V_{od} \\ -\frac{I_b}{V_{od}} v \sqrt{1 - \frac{v^2}{4V_{od}^2}} & \text{if } -\sqrt{2}V_{od} < v < \sqrt{2}V_{od} \\ I_b & \text{if } v \leq -\sqrt{2}V_{od} \end{cases} \quad (B.1)$$

with

$$V_{od} = \sqrt{\frac{I_b}{\mu_n C_{ox} W/L}}, \quad (B.2)$$

$V_{od}$ is the equilibrium overdrive voltage and $\pm \sqrt{2}V_{od}$ is the maximum voltage that the circuit can handle and corresponds to the nearly off state of one of the transistors. $\mu_n$ is the carrier mobility, $C_{ox}$ the oxide capacitance per unit area, $W$ the width and $L$ the effective length of the MOS transistors and $I_b$ the tail current of the differential pair.
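
To make Eqs. B.1 and B.2 concrete, the sketch below evaluates the overdrive voltage and the piecewise characteristic, together with the third-order Taylor expansion of Eq. B.1 about $v = 0$, which is the kind of polynomial used in the analysis that follows. The device values ($W/L = 240/0.18$, $k_n \approx 40\ \mu A/V^2$, $I_b = 30$ mA) are the ones quoted in Section B.4; using them together as a single bias point is an assumption of this example.

```python
import math

def overdrive_voltage(i_b, k_n, w_over_l):
    """Equilibrium overdrive voltage, Eq. B.2."""
    return math.sqrt(i_b / (k_n * w_over_l))

def i_pair(v, i_b, v_od):
    """Piecewise i-v characteristic of the cross-coupled pair, Eq. B.1."""
    v_max = math.sqrt(2.0) * v_od
    if v >= v_max:
        return -i_b
    if v <= -v_max:
        return i_b
    return -(i_b / v_od) * v * math.sqrt(1.0 - v * v / (4.0 * v_od * v_od))

def i_pair_cubic(v, i_b, v_od):
    """Third-order Taylor expansion of Eq. B.1 about v = 0."""
    return -(i_b / v_od) * v + (i_b / (8.0 * v_od**3)) * v**3

I_B = 30e-3          # tail current in A (value quoted in Section B.4.3)
K_N = 40e-6          # mu_n * C_ox in A/V^2 (value quoted in Section B.4.1)
W_OVER_L = 240.0 / 0.18

v_od = overdrive_voltage(I_B, K_N, W_OVER_L)
g_m = I_B / v_od     # small-signal slope magnitude at v = 0

print(f"V_od = {v_od:.3f} V, g_m = {g_m * 1e3:.1f} mS")
for v in (0.1, 0.5, 1.0):
    print(v, i_pair(v, I_B, v_od), i_pair_cubic(v, I_B, v_od))
```

The cubic tracks Eq. B.1 closely for small $|v|$ and departs from it as the pair approaches the clipping voltage $\sqrt{2}V_{od}$.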

In order to derive an analytical solution, this characteristic is approximated by a transfer function of the following form

$$i_P(v) = -av + bv^3, \quad (B.3)$$

where $a$ and $b$ are positive constants. A Taylor expansion of the nonlinear transfer function provides $a = I_b/V_{od} = g_m$ and $b = (1/8) \cdot I_b/V_{od}^3$, where $g_m$ is the equivalent small-signal transconductance of the differential pair. Fig. B.3 depicts both Eq. B.1 and the Taylor approximation defining the nonlinear conductance $g_m(v)$.

**Figure B.2.** Equivalent schematic of the cross-coupled oscillator. The amplitude limiting device is a cross-coupled transistor pair which is modelled by a nonlinear negative conductance $g_m(v)$.

**Note:** The exact values for $a$ and $b$ are approximated for practical purposes in the context of the oscillator design.

Figure B.3. Transfer function according to Eq. B.1 and Taylor approximation of the nonlinear input transfer function defining the conductance \( g_m(v) \).

The equation describing the circuit illustrated in Fig. B.2 is

\[
i_C + i_G + i_P + i_L = 0, \quad \text{(B.4)}
\]

which, after differentiation with respect to time, can be rewritten as

\[
\frac{d}{dt} i_C + \frac{d}{dt} i_G + \frac{d}{dt} i_P + \frac{d}{dt} i_L = 0. \quad \text{(B.5)}
\]

Furthermore, we have \( di_L/dt = v/L \), \( i_C = C \cdot dv/dt \), \( i_G = G v \) and, from Eq. B.3, \( di_P/dt = -a \cdot dv/dt + 3bv^2 \cdot dv/dt \). We substitute these expressions into Eq. B.5 to obtain

\[
C \ddot{v} + \left[ G - a + 3bv^2 \right] \dot{v} + \frac{v}{L} = 0. \quad \text{(B.6)}
\]

The factor in brackets appears to be the derivative of the equivalent voltage-dependent resistive part of the oscillating system. We then rewrite this term by factoring out $g_m - G$:

$$G_{nl}(v) = G - g_m + \frac{3}{8} \frac{g_m}{V_{od}^2} v^2$$

$$= (g_m - G) \left[ -1 + \left( \frac{3}{8} \frac{g_m}{V_{od}^2} \frac{1}{g_m - G} \right) v^2 \right],$$

and we get for the differential equation

$$C \ddot{v} + G_0 \left[ \left( \frac{v}{v_0} \right)^2 - 1 \right] \dot{v} + \frac{v}{L} = 0,$$

(B.7)

where $G_0 = g_m - G$ is the equivalent conductance and

$$v_0 = 2V_{od} \sqrt{\frac{2}{3} \left( 1 - \frac{G}{g_m} \right)}$$

(B.8)

represents a normalizing factor, which is related to the output amplitude.

By normalizing Eq. B.7 by $\sqrt{L/C}/v_0$ we get, after replacing $v$ by $v_0 y$,

$$\ddot{y} + \epsilon \omega_0 \left[ y^2 - 1 \right] \dot{y} + \omega_0^2 y = 0,$$

(B.9)

where $\omega_0 = \frac{1}{\sqrt{LC}}$ is the natural angular frequency of the oscillator and $\epsilon = G_0 \sqrt{L/C}$ is a damping factor and has the dimension of the inverse of the electrical quality factor.

By using the substitution $\tau = \omega_0 t$ the above equation can be reduced to the well-known “van der Pol” equation [182]

$$x + \dot{x} \cdot \epsilon \cdot (x^2 - 1) + \ddot{x} = 0.$$

(B.10)

Finding an exact solution to the “van der Pol” equation is not straightforward. The solving method depends on the value of $\epsilon$. First, the existence of a stable limit cycle is given by the theorem of Levinson-Smith and is guaranteed provided $\epsilon > 0$ or, equivalently, $g_m > G$. However, in the proposed high-frequency implementation, $\epsilon$ typically stays smaller than one, and the oscillator’s behavior is close to that of a harmonic oscillator. For small $\epsilon$ (typically $0 < \epsilon < 1$), we can get a good approximation of the exact solution to Eq. B.7 by using the
averaging method [183]:

\[
v(t) \approx \frac{2v_0}{\sqrt{1 + \left[\left(\frac{2v_0}{v(0)}\right)^2 - 1\right] e^{-\epsilon \omega_0 t}}} \cdot \cos (\omega_0 t - \varphi).
\]

(B.11)

The cosine term in the above equation represents the oscillations with frequency \(\omega_0\) and phase \(\varphi\). The latter depends on initial conditions. The first term of the right hand side describes the time-varying amplitude \(A(t)\) of the oscillations (envelope). \(A(t)\) is depicted in Fig. B.4 for two values of the initial condition \(v(0)\). The numerical solution of \(v(t)\) for the case of the large initial condition has been depicted for comparison with the estimated expression of the envelope. The case illustrated here shows a simulation for \(\epsilon \approx 0.4\). We notice a very good agreement between the numerical solution and the approximation of the envelope \(A(t)\).
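
The comparison between the numerical solution and the envelope can be reproduced with a few lines of code. The sketch below integrates the normalized equation $\ddot{y} + \epsilon(y^2 - 1)\dot{y} + y = 0$ (time in units of $1/\omega_0$) with a classical fourth-order Runge-Kutta scheme and evaluates the averaging-method envelope, assumed here to be $A(\tau) = 2/\sqrt{1 + [(2/y_0)^2 - 1]e^{-\epsilon\tau}}$ in normalized units. The value $\epsilon = 0.4$ is the one mentioned in the text; the initial condition $y_0 = 0.01$ and the step size are arbitrary choices for illustration.

```python
import math

def simulate_van_der_pol(eps, y0, tau_end, dtau=0.01):
    """Integrate y'' + eps*(y^2 - 1)*y' + y = 0 with RK4; returns (tau, y) samples."""
    def deriv(y, dy):
        return dy, -eps * (y * y - 1.0) * dy - y

    y, dy = y0, 0.0
    samples = []
    for i in range(int(round(tau_end / dtau)) + 1):
        samples.append((i * dtau, y))
        k1y, k1d = deriv(y, dy)
        k2y, k2d = deriv(y + 0.5 * dtau * k1y, dy + 0.5 * dtau * k1d)
        k3y, k3d = deriv(y + 0.5 * dtau * k2y, dy + 0.5 * dtau * k2d)
        k4y, k4d = deriv(y + dtau * k3y, dy + dtau * k3d)
        y += dtau * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        dy += dtau * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
    return samples

def envelope(tau, eps, y0):
    """Averaging-method envelope in normalized units (steady state equals 2)."""
    return 2.0 / math.sqrt(1.0 + ((2.0 / y0) ** 2 - 1.0) * math.exp(-eps * tau))

EPS = 0.4    # damping factor quoted in the text
Y0 = 0.01    # hypothetical normalized initial condition v(0)/v_0
TAU_END = 60.0

samples = simulate_van_der_pol(EPS, Y0, TAU_END)
# Peak magnitude of the numerical solution over the last oscillation period
tail_peak = max(abs(y) for tau, y in samples if tau >= TAU_END - 2.0 * math.pi)

print(f"numerical steady-state peak: {tail_peak:.3f}")
print(f"envelope at end of run:      {envelope(TAU_END, EPS, Y0):.3f}")
```

In physical units the amplitude is obtained by multiplying $y$ by $v_0$, so the normalized steady-state value of 2 corresponds to the amplitude $2v_0$.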

B.3.2. Wavelet Generator’s Settling Time

We define the overall settling time \(t_s\) as the time needed for the oscillation to reach 90 % of its steady-state amplitude. It can be separated into two phases: i) the oscillation’s onset delay \(t_d\) and ii) the oscillation’s rise time \(t_r\). The onset delay \(t_d\) is the time needed for the oscillator output to reach 10 % of the steady-state amplitude from a given small initial condition \(v(0)\), that is, an initial condition whose value is smaller than 10 % of the steady-state amplitude. The rise time \(t_r\) is the time needed for the oscillating signal to grow from 10 % to 90 % of its steady-state amplitude. The overall settling time is therefore \(t_s = t_d + t_r\). From Eq. B.11 we can express \(t_d\), \(t_s\) and \(t_r\) as follows:

\[
t_d \approx \frac{1}{\epsilon \omega_0} \left[2 \ln \left(\frac{2v_0}{v(0)} - 1\right) - 4.6\right]
= \frac{Q_{eq,res}}{\omega_0(A_{OL} - 1)} \left[2 \ln \left(\frac{2v_0}{v(0)} - 1\right) - 4.6\right],
\]

(B.12)

\[
t_s \approx \frac{1}{\epsilon \omega_0} \left[ 2 \ln \left( \frac{2v_0}{v(0)} - 1 \right) + 1.45 \right]
= \frac{Q_{eq, res}}{\omega_0(A_{OL} - 1)} \left[ 2 \ln \left( \frac{2v_0}{v(0)} - 1 \right) + 1.45 \right],
\]

(B.13)

and for the rise time:

\[
t_r = t_s - t_d \approx \frac{6.05}{\epsilon \omega_0} = \frac{6.05 \cdot Q_{eq, res}}{\omega_0(A_{OL} - 1)},
\]

(B.14)

where \( 2v_0 \) is the steady-state oscillation amplitude, which can be derived from the nonlinear expression of the i-v characteristic of the active device and is given by Eq. B.8. \( v(0) \) is the initial condition. The equations above have been expressed with practical parameters of an oscillator as well, that is the open-loop gain \( A_{OL} = g_m/G \) and the quality factor of the passive “GLC” parallel resonator \( Q_{eq, res} = G^{-1} \sqrt{C/L} \).

Figure B.4. Envelope function $A(t)$ for different initial conditions $v(0)$ according to Eq. B.11. The solid curve illustrates the overall settling time of a circuit with an initial condition $v(0) = 10$ mV, which is a thousand times higher than the one represented by the dotted curve, where $v(0) = 10$ $\mu$V. The numerical solution for $v(t)$ is given for reference (dashed line). The equation parameters are $A_{OL} = 4.1$, $Q_{eq, res} = 5.8$ and $\omega_{osc} = 2\pi \cdot 4.5 \cdot 10^9$. $t_d$ is the onset delay and $t_r$ represents the rise time of the signal’s envelope (see Section B.3.2 for details).

Eq. B.12 and Eq. B.14 show that the rise time \( t_r \) depends only on the open-loop gain \( A_{OL} \) and the quality factor \( Q_{eq, res} \) of the resonator, while the onset delay \( t_d \) depends on these parameters as well as on the initial condition \( v(0) \). To obtain a sufficiently short overall settling time, both effects have to be considered in practice. In the case of switched oscillators for a UWB wavelet generator, the onset delay \( t_d \) should be as short as possible in order to increase the overall efficiency, and \( t_r \) must be smaller than half the pulse length. Fig. B.4 illustrates the effect of different initial conditions \( v(0) \) on the overall settling time \( t_s \) of the oscillating signal. Small initial conditions impose longer delays on the onset of the output signal.
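
The settling-time expressions can be evaluated numerically from the envelope $A(t) = 2v_0/\sqrt{1 + [(2v_0/v(0))^2 - 1]e^{-\epsilon\omega_0 t}}$, which is the form assumed in this sketch. It uses the parameters listed in the caption of Fig. B.4 ($A_{OL} = 4.1$, $Q_{eq,res} = 5.8$, 4.5 GHz) and $\epsilon = (A_{OL} - 1)/Q_{eq,res}$. The amplitude ratio $2v_0/v(0) = 100$ is a hypothetical value chosen for illustration.

```python
import math

def time_to_reach(fraction, amplitude_ratio, eps, omega0):
    """Time for the envelope to reach `fraction` of its steady-state value.

    amplitude_ratio is 2*v0 / v(0); it must exceed 1/fraction.
    """
    k = amplitude_ratio**2 - 1.0
    return math.log(k / (1.0 / fraction**2 - 1.0)) / (eps * omega0)

A_OL = 4.1                        # open-loop gain (Fig. B.4)
Q_RES = 5.8                       # resonator quality factor (Fig. B.4)
OMEGA0 = 2.0 * math.pi * 4.5e9    # oscillation frequency in rad/s (Fig. B.4)
EPS = (A_OL - 1.0) / Q_RES        # damping factor of the normalized equation
RATIO = 100.0                     # hypothetical 2*v0 / v(0)

t_d = time_to_reach(0.1, RATIO, EPS, OMEGA0)   # onset delay
t_s = time_to_reach(0.9, RATIO, EPS, OMEGA0)   # overall settling time
t_r = t_s - t_d                                # rise time

print(f"t_d = {t_d * 1e9:.2f} ns, t_r = {t_r * 1e9:.2f} ns, t_s = {t_s * 1e9:.2f} ns")
```

The rise time is independent of the amplitude ratio, whereas the onset delay grows logarithmically as the initial condition shrinks, which is why a large forced initial condition shortens the overall settling time.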

B.4. Circuit Implementation

B.4.1. Oscillator Core

The oscillating circuit used in the proposed implementation is depicted in Fig. B.5 (middle dashed box) and is based on a cross-coupled differential pair [181]. This circuit features a negative \( g_m \) at its output and can be seen as the transistor version of negative resistance devices (NRD) such as tunnel diodes. The proposed circuit has been implemented exclusively with CMOS transistors as provided by the IBM BiCMOS7HP technology. Although bipolar transistors provide higher
**Figure B.5.** Simplified circuit schematic of the entire wavelet generator. The middle dashed box contains the active circuit, which is fed by a modified tail current source (lower middle box). The resonator is depicted in the upper center dashed box. The leftmost part represents the start-up circuit. Output buffers with on-chip capacitors are depicted on the right-hand side (see text of Section B.4 for detailed explanations).
the circuit has been realized in CMOS to verify the performance
of a wavelet generator using low-cost silicon technology. Each transistor has a $W / L$ ratio of $240 \ \mu m / 0.18 \ \mu m$ to provide a sufficient $g_m$. The technology’s typical transconductance parameter $k_n = \mu_n C_{ox}$ equals approximately $40 \ \mu A/V^2$ for n-channel MOS transistors.

**B.4.2. Resonator Load**

The resonator is depicted in the upper center dashed box. Varactors enable an external analog voltage $V_{freq}$ to modify the center frequency of the oscillator. Possible process variations can thus be compensated, enabling the generated signal to comply with the system’s specification. To achieve a reduced overall settling time and a reduced ringing at the oscillator extinction, the quality factor $Q$ of the resonator has been halved from 20 (on-chip 0.6 nH inductor’s quality factor) to 10 by adding a parallel resistor $R = 1/G = 200 \ \Omega$. Note that reducing the $Q$-value of the resonator implies a higher oscillator phase noise level. However, since the effect of the switched oscillator can be seen as the amplitude modulation of an equivalent continuous wave (CW) signal by a very short pulse, the corresponding spectrum is extended far beyond a slight increase caused by the phase noise degradation. In other words, the envelope of the spectrum is defined by the short pulse modulation, while the purity of the spectral lines formed by the pulse repetition is slightly modified by the phase noise of the oscillator. The latter does not play any role in the spectrum envelope, which is the most important aspect for UWB signals.

**B.4.3. Modified Tail Current Source**

Usually, in continuous wave oscillators as used in narrow band transmitters, the oscillations start from very small initial conditions imposed by the intrinsic thermal noise, which is typically on the order of nanovolts. The resulting settling time extends far above one nanosecond and is further increased by the high quality factors required for these applications. Based on the investigations in Section B.3.2, we can reduce the overall settling time $t_s$ by forcing the oscillator to start with larger initial conditions $v(0)$, rather than simply relying on the intrinsic circuit noise. This can be achieved with a slight modification of the tail current source of the cross-coupled pair and an additional logic circuit as depicted in Fig. B.5.

The modified tail current source of the cross-coupled pair (lower centered dashed box) consists of two tail current sources $I_b/2$, which can be independently triggered by a logic circuit represented in the leftmost dashed box. The logic block produces a slight delay $\Delta t$ at the onset of only one of the two tail current sources. Before the activation of the oscillator core, both transistors are in the off-state and present the same high impedance at their source nodes. During the startup process, the tail current that flows through the first activated source pulls down the source nodes. As long as the impedance at the source nodes remains similar, the equivalent circuit seen by the triggered source is a current divider, thus forcing a net current $I_b/4$ to flow through the capacitor $C_S$. The voltage $\Delta V_S$ across the capacitor $C_S$ at $t = \Delta t$ is

$$\Delta V_S = \frac{I_b \Delta t}{4C_S}. \quad (B.15)$$

This voltage sets an imbalance in the differential pair by further pulling down the source of the transistor being on the side of the first activated current source. This voltage difference is passed on to the resonator by the cross-coupled implementation, thus creating the required large signal initial condition.

The capacitor $C_S$ connected between the current sources plays a second important role. This capacitor ensures proper operation of the cross-coupled pair during the oscillating phase by creating a low-impedance path between the transistors’ sources. Thus, the value of the capacitor results from a trade-off between the quality of the low-impedance path forming the common-mode node of the differential pair and the amplitude of the initial conditions given by Eq. B.15. In practice, an impedance of less than ten ohms should be chosen, so that the negative small-signal $g_m$ is not significantly degraded (source degeneration). At 4 GHz, this leads to a capacitor value on the order of a few picofarads. This value further determines the amount of delay needed to create the source voltage imbalance at startup. To obtain large-signal initial conditions, $\Delta V_S$ should be set on the order of the threshold voltage. By choosing $\Delta V_S \approx 100$ mV and $I_b = 30$ mA, Eq. B.15 gives a delay $\Delta t$ of approximately 50 ps. This delay can be implemented by cascading several inverters as described in the next section.
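
The sizing argument above can be checked with a short calculation. The sketch below derives the smallest capacitance whose impedance magnitude stays below 10 Ω at 4 GHz and then solves Eq. B.15 for the delay $\Delta t$. Using exactly that minimum capacitance as the value of $C_S$ is an assumption of this example, and the function names are illustrative.

```python
import math

def min_source_capacitance(z_max_ohm, freq_hz):
    """Smallest capacitance with |Z| = 1/(2*pi*f*C) not exceeding z_max_ohm."""
    return 1.0 / (2.0 * math.pi * freq_hz * z_max_ohm)

def startup_delay(delta_vs, c_s, i_b):
    """Delay needed to build delta_vs across C_S, from Eq. B.15."""
    return 4.0 * c_s * delta_vs / i_b

c_s = min_source_capacitance(10.0, 4e9)       # 10 ohm limit at 4 GHz
delta_t = startup_delay(0.1, c_s, 30e-3)      # 100 mV imbalance, I_b = 30 mA

print(f"C_S >= {c_s * 1e12:.2f} pF")
print(f"delay for 100 mV imbalance: {delta_t * 1e12:.0f} ps")
```

A larger $C_S$ improves the low-impedance path but lengthens the delay required for the same imbalance, which is the trade-off described in the text.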

**B.4.4. Start-Up Circuit**

The leftmost box of Fig. B.5 contains a simplified schematic of the triggering circuit. The digital signal $V_{\text{trig}}$ is the main control signal that can be easily provided by the digital output of a baseband processor. Its
length is inversely proportional to the bandwidth of the UWB wavelet. $V_{\text{trig}}$ is split into two signals, each of which triggers one of the current sources $I_b/2$, but only one of these signals is delayed by a cascade of four inverters ($\Delta$ blocks). The signal $V_{\text{phase}}$ sets the two pass-gate multiplexers in such a manner that the delayed path is routed to only one of the current sources. Here, $V_{\text{phase}}$ high imposes a delay on the leftmost current source.

Another interesting feature that comes with the addition of a triggering circuit, used in conjunction with a fully symmetrical start-up circuit, is the possibility to create bi-phase signals. By starting the pulse randomly with phase $0^\circ$ or $180^\circ$, the spectral lines that appear with a periodic signal can be reduced. Thereby, a more homogeneous power spectral density is created (dithering effect), with the further benefit that a higher average emitted power is allowed.
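
The dithering effect can be illustrated with a simple model. For a pulse train $\sum_k a_k\, p(t - kT)$, the spectrum at a harmonic of the pulse rate is proportional to $\sum_k a_k$, because every pulse adds in phase at those frequencies. The sketch below compares a fixed-polarity train with a randomly bi-phase one. The sequence length and the random seed are arbitrary choices, and the model ignores the pulse shape itself.

```python
import random

def line_amplitude(polarities):
    """Normalized spectral-line magnitude at a harmonic of the pulse rate.

    At f = m/T all pulses add in phase, so the line equals the mean polarity.
    """
    return abs(sum(polarities)) / len(polarities)

random.seed(1)        # fixed seed so the example is repeatable
N_PULSES = 10_000

fixed_phase = [1] * N_PULSES
bi_phase = [random.choice((1, -1)) for _ in range(N_PULSES)]

print(f"fixed phase: {line_amplitude(fixed_phase):.3f}")
print(f"random bi-phase: {line_amplitude(bi_phase):.3f}")
```

With random polarities the line amplitude shrinks roughly as $1/\sqrt{N}$, so the energy is spread into a continuous spectrum instead of discrete lines.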

### B.4.5. Output Buffers

Additional output buffers with 2.5 pF on-chip decoupling capacitors are implemented to isolate the oscillator core from the impedance variations of the load and thus to ensure a stable start-up behavior. A source follower topology is used for the output transistors, whose bias source $I_0$ is switched on by the digital trigger signal only during the time of emission. This reduces the overall power consumption of the entire circuit. Furthermore, disabling the output buffer at the end of a pulse reduces the ringing of the resonator and thus avoids narrowing the signal bandwidth.

### B.5. Results

The circuit has been measured on a Duroid test substrate connected to an oscilloscope and a spectrum analyzer via microstrip lines and coaxial connections. Each port of the differential output has been measured on a separate channel of the oscilloscope. Each channel input presents a $50 \ \Omega$ load to the generator output. Consequently, the equivalent load for the differential output signal is $100 \ \Omega$. Fig. B.6 shows a comparison between a simulated differential output signal and the sampling oscilloscope’s differential channel measurements. Both simulation and measurements have the same bias conditions. Although short, the digital trigger signal of 2.5 ns duration that has been chosen clearly leads to the three oscillator phases (rise time, steady-state and extinction phase) in the generated wavelet, which complies with the UWB bandwidth requirements. The oscillator starts within less than one nanosecond, which matches well with the calculations illustrated in Fig. B.4. At time 2 ns, we notice a short steady-state phase, which precedes the extinction phase, whose duration is approximately 1 ns. Additionally, Fig. B.6-b illustrates the bi-phase signaling ability of the wavelet generator; both measurements show two wavelets superimposed, where phases 0° and 180° correspond to $V_{\text{phase}} = 0$ V and $V_{\text{phase}} = 1.8$ V, respectively. A good phase inversion behavior of 180° ± 10° is attained owing to the balanced circuit implementation.

Figure B.6. Simulated (a) and measured (b) differential outputs loaded by an equivalent impedance of 100 Ω. The measured peak-to-peak voltage reaches 900 mV and the pulse length is 2.5 ns. A reduced number of bondwires to the test substrate ground causes a slight discrepancy during the envelope rise between $t = 0.5$ ns and $t = 1.5$ ns.

The power spectral densities (PSD) of the measured and simulated output signals shown in Fig. B.6 are illustrated in Fig. B.7. This graph shows the calculated average PSD in dBm/MHz for a pulse repetition rate of 10 Mp/s. The measured bandwidth is 1.2 GHz and is centered at 4.5 GHz, which compares well with the simulation results. For the measured case, we notice a small spectral outgrowth around 4 GHz, which widens the bandwidth somewhat. This component originates from a slight sweep of the instantaneous frequency during the wavelet’s transient extinction phase beyond $t = 2$ ns. The resonator is loaded by an additional capacitance formed by the gate oxide capacitance in series with the depletion capacitance between the channel and the transistor’s substrate.

Figure B.7. Calculated PSD of the measured and simulated signal (in dBm/MHz). The -10 dB bandwidth is 1.2 GHz. The UWB mask is shown as a reference.

The measured peak PSD is less than 3 dB lower than the simulated peak PSD. This discrepancy can be explained partly by a slower than expected envelope rise. The most likely causes are underestimated on-chip parasitic effects and a limited number of ground pads, which increase the equivalent impedance connected in series with the tail current source, especially during the start-up phase (inductive bondwire effect). The measurement cables and the substrate board account for slightly less than 1 dB difference.

However, the output amplitude of the wavelet generator can be modified by adjusting the bias of the differential pair. Thus, losses that occur in a T/R switch or a pre-select filter can be easily compensated.

Another advantage of such a topology is that the spectrum is free from parasitic spikes. Such spectral lines typically occur in wavelet generators using up-conversion mixers with a non-ideal LO-RF leakage characteristic or output gates with a non-ideal forward isolation (especially at low pulse rates).

Contrary to the solution using a mixer, which enables a more accurate bandwidth and sidelobe control by choosing the appropriate modulating signal, our implementation does not provide any means to control the spectrum, except by choosing the pulse length. The FCC’s UWB mask for indoor applications [9] is illustrated as a reference in Fig. B.7. The violation of the UWB mask between 1 and 2 GHz can be easily corrected by the additional attenuation of the pre-select filter in this band or by simply shifting the high-pass corner frequency defined by the series output capacitors.

At a pulse repetition rate of 10 Mp/s, the circuit consumes an average power of only 1.8 mW at a nominal voltage of 1.8 V. In this configuration, the oscillator is switched on for 2.5 ns every 100 ns, which corresponds to a duty cycle of 1/40 with a peak power requirement of about 70 mW. Equivalently, the energy needed per pulse is 180 pJ. The average signal power delivered to an equivalent balanced 100 Ω load is -20 dBm (10 μW). The circuit is able to drive a 50 Ω load with up to 20 μW output power, thus having an overall efficiency above 1 %.
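The figures quoted above follow from simple duty-cycle arithmetic. The short Python sketch below reproduces them; it assumes that all of the average power is consumed during the on-time (no static consumption), and the variable names are illustrative only.

```python
# Power budget of the duty-cycled wavelet generator (values quoted in the text).
# Assumption: the whole average power is drawn during the on-time only.
p_avg = 1.8e-3   # average power consumption at 10 Mp/s [W]
prr = 10e6       # pulse repetition rate [pulses/s]
t_on = 2.5e-9    # oscillator on-time per pulse [s]
p_out = 20e-6    # maximum average output power into 50 ohm [W]

period = 1.0 / prr               # 100 ns between two pulses
duty_cycle = t_on / period       # 1/40
energy_per_pulse = p_avg / prr   # 180 pJ
p_peak = p_avg / duty_cycle      # about 72 mW, quoted as 70 mW
efficiency = p_out / p_avg       # slightly above 1 %


def average_power(rate, e_pulse=energy_per_pulse):
    """Average power for a given pulse rate, assuming linear scaling."""
    return rate * e_pulse
```

Under the same linear-scaling assumption, `average_power(5e6)` gives 0.9 mW, which is consistent with a sub-mW consumption for pulse rates below 5 Mp/s.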

The total chip area including pads measures $0.92 \times 1.32 \ \text{mm}^2$, while the active area has a size of approximately $0.6 \times 0.95 \ \text{mm}^2$. A photograph of the chip is shown in Fig. B.8. This fully integrated circuit implementation is significantly cheaper than other known pulse generators requiring external devices, since a standard commercial $0.18 \ \mu\text{m}$ CMOS process costs only approximately 0.1 USD/mm$^2$ in mass production.

A summary of the characteristics of the proposed realization is provided in Table B.1 and compared to other pulse generator implementations. To obtain a fair comparison, the measured power consumption has to be weighted by the output peak-to-peak voltage given in the last column.

B.6. Conclusion

The design, realization and results of a fast-starting, low-power UWB wavelet generator have been presented. An oscillating core based on a cross-coupled differential pair has been used to generate a multicycle UWB signal. The entire circuit including output buffers is only switched on during the actual wavelet emission period, such that its power consumption is linearly dependent on the pulse repetition rate and in the sub-mW range for a pulse rate below 5 Mp/s. The bandwidth of the UWB signal can be extended beyond 1 GHz and the spectrum is centered at 4.5 GHz. Only two external low-speed digital signals have to be provided to drive the proposed wavelet generator.

**Figure B.8.** Chip photograph of the UWB wavelet generator. The total chip area measures $0.92 \times 1.32 \ \text{mm}^2$.

**Table B.1.** Summary of the performance and comparison with previously reported implementations.

<table>
<thead>
<tr>
<th>Ref.</th>
<th>Technology [\(\mu\)m]</th>
<th>sim./meas.</th>
<th>Pulse type</th>
<th>Modul.</th>
<th>En./p. [pJ/p]</th>
<th>Output (50 \(\Omega\))</th>
</tr>
</thead>
<tbody>
<tr>
<td>This work</td>
<td>CMOS 0.18</td>
<td>meas.</td>
<td>multicycle (\(&gt;2\) ns)</td>
<td>PPM, Freq., \(\phi\)</td>
<td>180</td>
<td>0.9 Vpp</td>
</tr>
<tr>
<td>[160]</td>
<td>CMOS 0.18 (up-conversion)</td>
<td>meas.</td>
<td>multicycle (1.1-4.5 ns)</td>
<td>PPM, Freq.</td>
<td>50</td>
<td>0.2 Vpp</td>
</tr>
<tr>
<td>[179]</td>
<td>CMOS 0.35 (\(g_m\)-C cells)</td>
<td>sim.</td>
<td>monocycle (300 ps)</td>
<td>PPM</td>
<td>100</td>
<td>20 mVpp</td>
</tr>
<tr>
<td>[175]</td>
<td>BiCMOS 0.18 (up-conversion)</td>
<td>meas.</td>
<td>multicycle (3 ns)</td>
<td>PPM, Freq., \(\phi\)</td>
<td>300</td>
<td>0.2 Vpp</td>
</tr>
<tr>
<td>[176]</td>
<td>CMOS 0.18 (delayed-tap)</td>
<td>sim.</td>
<td>monocycle (200 ps)</td>
<td>PPM, Freq.</td>
<td>230</td>
<td>1.8 Vpp</td>
</tr>
<tr>
<td>[180]</td>
<td>Discrete (SRD)</td>
<td>meas.</td>
<td>monocycle (300 ps)</td>
<td>PPM</td>
<td>n.a.</td>
<td>0.4 Vpp</td>
</tr>
</tbody>
</table>
A Spectrum-Shaping Output Stage for IR-UWB Tx

Abstract - This paper presents a novel on-chip wideband balanced-to-unbalanced (balun) transition circuit based on a complementary current inversion method. The proposed circuit features an easily-controllable wide bandpass characteristic and is intended to be used as an output stage to filter ill-defined spectra of time-domain UWB pulses to fit regulation masks. We investigated the circuit in an active topology driven by differential pairs. A mathematical analysis is presented as well as the implementation of an integrated circuit realized with an IBM BiCMOS 0.18 µm technology.

C.1. Introduction

Interfaces between Radio-Frequency Integrated Circuits (RFIC) and off-chip components such as antennas (or filters) are often bottlenecks in radio front-end design. Antennas and off-chip ceramic filters mostly use an unbalanced topology, while most modern RFICs are designed with a balanced topology to exploit its higher power-supply noise immunity, to achieve better image rejection in mixers or to enable designs with very low second-order distortion. In the field of Ultra-Wideband Radio Technology (UWB-RT), the design of interfaces poses additional challenges due to the signal’s large bandwidth.

For both U.S. and European regulations, UWB signals must fit into well-defined spectrum masks [8,12]. In this paper, we concentrate on the U.S. regulatory constraints [8] for indoor operation. Achieving optimal spectrum occupancy in order to obtain the maximum output power (and thus, more reliable data transmission links) may turn out to be difficult, especially for Impulse-Radio UWB (IR-UWB) technologies, whose low duty-cycled signals feature very fast signal edges that cannot be accurately controlled. On the other hand, emission limits for out-of-band UWB signals are very stringent, in particular for the frequency bands occupied by the Global Positioning System (GPS) or Radio Astronomy Services (RAS). These constraints must be considered in the design of output stages for IR-UWB.

The second issue concerns the transition from differential to single-ended signals. In output stages and drivers, this transition is usually realized by the stacked transistor arrangement of a source-follower (or common-drain) with a common-source transistor [184], which relies on the accurate matching of the gain between the two stages. This topology, however, requires an additional active stage with some voltage headroom and consequently cannot be stacked (current reuse) to avoid additional current consumption. A common solution, popular since the mid-1990s, is the use of passive baluns realized with monolithic transformers [185,186]. This solution requires specific models, which are often not part of IC design kits.

As shown in Fig. C.1, this work presents a fully integrated output stage for UWB transmitters emitting signals in the lower UWB band, as specified by the FCC. The proposed novel topology enables two features: 1) a transformer-less differential to single-ended transition for multi-GHz bandwidths and 2) spectrum-shaping of ill-defined time-domain IR-UWB pulses. The paper focuses on the mathematical expressions that describe the circuit behavior as well as on the realization and the measurements of an integrated circuit. It will show that the combined response of an IR-UWB signal generator (see, e.g., [86]) and the proposed output stage satisfies the FCC spectral mask for indoor applications in the lower part of the UWB spectrum.

In the second section of this paper, the principle of the proposed output stage is presented. The third section deals with the implementation of a fully integrated circuit. Measurement results, discussion and applications are provided in the fourth and fifth sections.

C.2. Output Stage Principle

In RF designs, a differential topology is usually kept until the very last stage in order to have better power-supply noise immunity. Converting balanced signals into single-ended ones is an easy task for narrow-band communication, where classical passive current inverters based on lumped elements can be used [187]. Such a signal transition turns out to be much more complex for multi-GHz bandwidths.

Figure C.1. Type of IR-UWB transmitter considered. The design of the output stage is the subject of this paper.

C.2.1. Basic Current-Inverter Network

Before we investigate the differential to single-ended transition principle for UWB applications, let us consider the basic circuit topology for the realization of this function: the passive current inverter. A current inversion can be obtained by an inductance-capacitance (LC) network as depicted in Fig. C.2. We first analyze the ideal case, where \( R = 0 \).

The current transfer function \( H_{\text{inv}} \) can be expressed as:

\[
H_{\text{inv}}(\omega) = \frac{i_{\text{inv}}(\omega)}{i_{\text{in}}(\omega)} = -\frac{\omega^2 LC}{1 - \omega^2 LC} \tag{C.1}
\]

In this case, the current is actually perfectly inverted in phase from DC up to the resonance frequency \( \omega_0 = 1/\sqrt{LC} \), since \( H_{\text{inv}}(\omega) \) is a real and negative value (dashed line in Fig. C.2). The current amplitude however changes according to the magnitude of the above equation. This amplitude equals unity at \( \omega = \omega_{\text{inv}} = 1/\sqrt{2LC} = \omega_0/\sqrt{2} \) (square markers). At this frequency, the LC-network acts as a unity-gain current inverter.

Figure C.2. Ideal current inverter formed by a high-pass LC-network. For an ideal zero-resistance load, the current is perfectly inverted (\( i_{\text{in}} = -i_{\text{inv}} \)) at \( \omega_{\text{inv}} = 1/\sqrt{2LC} \) (square marker). A nonzero load impedance \( R \) reduces the quality factor of the capacitor branch and introduces a negative phase shift with respect to the ideal 180°, see Eq. C.2 (in this example \( Q = 2.5 \); a round marker shows the non-ideal phase shift of 147° at the unity-gain frequency \( \omega_u \)).

A nonzero load impedance reduces the equivalent quality factor \( Q_{RC,\text{series}} \) of the capacitor branch, i.e. \( Q_{RC,\text{series}} = (\omega RC)^{-1} \). The resistor thus introduces a negative phase shift from the ideal 180° and further reduces the amplitude of the inverted current. In this case, the current transfer function of Eq. C.1 is modified according to the following expression:

\[
H_{\text{inv}}(s) = \frac{s^2}{s^2 + s\frac{\omega_0}{Q} + \omega_0^2}, \tag{C.2}
\]

where \( \omega_0 = 1/\sqrt{LC} \), \( Q = \frac{1}{R}\sqrt{L/C} \) and \( s = j\omega \). The current transfer function \( H_{\text{inv}}(s) \) is thus a second-order high-pass transfer function. Solving this equation for \( |H_{\text{inv}}(s)| = 1 \) (unity-gain current inversion) leads to a quartic equation for \( s \) [188]. The angular frequency \( \omega_u \) at which unity gain occurs is

\[
\omega_u = \frac{\omega_{\text{inv}}}{\sqrt{1 - \frac{1}{2Q^2}}} = \frac{\omega_0}{\sqrt{2 - \frac{1}{Q^2}}}. \tag{C.3}
\]

In summary, we observe from Fig. C.2 that the bandwidth of such a current inverter topology is limited by the non-flat amplitude of the transfer function for high quality factors and by the deterioration of the phase transfer function for low \( Q_{RC} \) topologies.
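As a quick numerical check of Eqs. C.2 and C.3, the following Python sketch evaluates the loaded current inverter at its unity-gain frequency. The frequency is normalized to \( \omega_0 = 1 \) and the function names are illustrative; the closed form requires \( Q > 1/\sqrt{2} \).

```python
import cmath
import math


def h_inv(omega, omega0=1.0, q=2.5):
    """Second-order high-pass current transfer function of Eq. C.2."""
    s = 1j * omega
    return s**2 / (s**2 + s * omega0 / q + omega0**2)


def unity_gain_frequency(omega0=1.0, q=2.5):
    """Angular frequency of Eq. C.3 (real only for q > 1/sqrt(2))."""
    return omega0 / math.sqrt(2.0 - 1.0 / q**2)


w_u = unity_gain_frequency()                  # normalized to omega0 = 1
h_u = h_inv(w_u)
gain = abs(h_u)                               # unity by construction
phase_deg = math.degrees(cmath.phase(h_u))    # about 147 degrees for Q = 2.5
```

For \( Q = 2.5 \), the computed phase is close to the 147° indicated by the round marker in Fig. C.2, i.e. about 33° short of an ideal inversion.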
C.2.2. Differential to Single-Ended Conversion

The differential to single-ended conversion is obtained by adding the inverted current \( i_{\text{inv}}(s) \) to the current \(-i\) provided by the negative output of a differential pair. The circuit’s principle is depicted in Fig. C.3. By applying current division between the load \( R \) and the resonator built by \( L \) and \( C \), the current contribution of the negative output \(-i\) to the load \( R \) can be calculated as:

\[
i_{\text{neg}}(s) = -i \cdot H_{\text{neg}}(s) = -i \cdot \frac{s^2 + \omega_0^2}{s^2 + s \frac{\omega_0}{Q} + \omega_0^2}, \tag{C.4}
\]

where \( \omega_0 \), \( Q \) and \( s \) have been previously defined in Eq. C.2. The resulting output current \( i_{\text{unbal,LP}}(s) \) is the superposition of \( i_{\text{neg}}(s) \) and \( i_{\text{inv}}(s) \) and results in a second-order low-pass function:

\[
i_{\text{unbal,LP}}(s) = i_{\text{neg}}(s) + i_{\text{inv}}(s) = \frac{-i \cdot \omega_0^2}{s^2 + s \frac{\omega_0}{Q} + \omega_0^2}. \tag{C.5}
\]

The Bode plots of Fig. C.3 show that both currents are added in phase (dotted and dashed lines) but that a true balun behavior only occurs at the frequency where both amplitudes are equal, i.e. $\omega = \omega_0 / \sqrt{2}$ (triangle marker). The resulting transfer function $H_{\text{unbal,LP}}(s)$ is now a second-order low-pass function, whereas the current inversion $H_{\text{inv}}(s)$ was a second-order high-pass function.
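The superposition of Eqs. C.4 and C.5 can be verified numerically. The sketch below is a minimal check with normalized values (\( \omega_0 = 1 \), \( Q = 2.5 \), \( i = 1 \)); these values and the function names are assumptions chosen for illustration.

```python
import cmath
import math


def balun_currents(omega, omega0=1.0, q=2.5, i=1.0):
    """Return (i_neg, i_inv) at the load, following Eqs. C.4 and C.2."""
    s = 1j * omega
    den = s**2 + s * omega0 / q + omega0**2
    i_neg = -i * (s**2 + omega0**2) / den   # negative-output contribution
    i_inv = i * s**2 / den                  # inverted current
    return i_neg, i_inv


def i_unbal_lp(omega, omega0=1.0, q=2.5, i=1.0):
    """Second-order low-pass output current of Eq. C.5."""
    s = 1j * omega
    return -i * omega0**2 / (s**2 + s * omega0 / q + omega0**2)


w_bal = 1.0 / math.sqrt(2.0)                # frequency of equal amplitudes
i_neg_bal, i_inv_bal = balun_currents(w_bal)
phase_difference = cmath.phase(i_neg_bal) - cmath.phase(i_inv_bal)
```

At \( \omega = \omega_0/\sqrt{2} \) both contributions have the same magnitude and the same phase, so each branch delivers half of the single-ended output current.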

C.2.3. Proposed Topology

The proposed output stage is based on the complementary arrangement of a high-pass and low-pass current inversion to flatten the transfer function of the output stage. An equivalent schematic is depicted in Fig. C.4.

![Figure C.4. Simplified schematic of the output stage. Nodes “n1” and “n2” are the inputs of the high-pass and low-pass current inverters, respectively. The output of both current inverters are summed in a common load R (“summing node”).](image)

The resulting single-ended output currents $i_{\text{unbal,[LP,HP]}}$ of both low-pass and high-pass baluns are summed in a common load $R$ to obtain the output current $i_{\text{out}}$ (summing node). These output currents are actually loaded by the resonator of the complementary inverter (e.g., resonator $L_2-C_2$ loads the output $i_{\text{unbal,LP}}$). It is possible to calculate the exact expression with the help of the superposition principle, but this results in complex 4th-order expressions. Simplifications are however possible by considering only the effect of the complementary LC loads on the currents coming out of the negative outputs of the differential pairs. On the other hand, the effect of a complementary LC load on the inverted current (e.g., resonator $L_2-C_2$ on $i_{\text{inv,HP}}$) can be neglected, since the net effect is a notch appearing in the stopband, where the current (i.e., $i_{\text{inv,HP}}$) is attenuated. Fig. C.5 shows the equivalent schematic used for the approximation. This simplification allows Eq. C.4 to be rewritten as follows:

$$i_{\text{neg}}(s) \approx \hat{i}_{\text{neg}}(s) = -i \cdot \frac{s^2 + \omega_{01}^2}{s^2 + s \frac{\omega_{01}}{Q_1} + \omega_{01}^2} \cdot \frac{s^2 + \omega_{02}^2}{s^2 + s \frac{\omega_{02}}{Q_2} + \omega_{02}^2}, \quad (C.6)$$

![Figure C.5. Approximation of $i_{\text{neg}}(s)$. The amplitude of the transfer function located between the two resonances exhibits a small discrepancy, while phase shows an almost perfect match.](image)

The transfer function $H_s(s)$ at the summing node is calculated by adding both balun currents $i_{\text{unbal,LP}}(s)$ and $i_{\text{unbal,HP}}(s)$ with the contribution of the negative outputs, i.e. $2 \cdot \hat{i}_{\text{neg}}(s)$. We thus obtain an analytical approximation $\hat{H}_s(s)$ for the current transfer function $i_{\text{out}}/i$:

$$
\begin{aligned}
\hat{H}_s(s) &= \frac{i_{\text{inv,HP}}(s) + i_{\text{inv,LP}}(s) + 2 \cdot \hat{i}_{\text{neg}}(s)}{i} \\
&= \frac{-\left(s^4 + 2s^2 \omega_{02}^2 + \omega_{01}^2 \omega_{02}^2\right) + \left(s \frac{\omega_{01}^2 \omega_{02}}{Q_2} + s^3 \frac{\omega_{01}}{Q_1}\right)}{\left(s^2 + s \frac{\omega_{01}}{Q_1} + \omega_{01}^2\right)\left(s^2 + s \frac{\omega_{02}}{Q_2} + \omega_{02}^2\right)} \quad (C.7)
\end{aligned}
$$

It can be shown that, except in the close vicinity of the two frequency notches, the magnitude of the left-hand term in parentheses in the numerator of Eq. C.7 is much larger than that of the right-hand term containing the first- and third-order terms. Thus, by neglecting the right-hand term in the numerator and after factorization, a simple analytical approximation $\hat{H}_s(s)$ of the exact transfer function $H_s(s)$ can be obtained:

$$H_s(s) \approx \hat{H}_s(s) = -\frac{s^2 + \omega_{nH}^2}{s^2 + s\frac{\omega_{01}}{Q_1} + \omega_{01}^2} \cdot \frac{s^2 + \omega_{nL}^2}{s^2 + s\frac{\omega_{02}}{Q_2} + \omega_{02}^2}, \quad (C.8)$$

where $\omega_{nL}$ and $\omega_{nH}$ are the zeros of the transfer function and correspond to the low- and high-frequency notch, respectively, and are given by:

$$\omega_{nL} = \omega_{02} \sqrt{1 - \sqrt{1 - (\omega_{01}/\omega_{02})^2}}, \quad (C.9)$$

$$\omega_{nH} = \omega_{02} \sqrt{1 + \sqrt{1 - (\omega_{01}/\omega_{02})^2}}. \quad (C.10)$$

The resulting transfer function can thus be approximated by the cascade of a notched low-pass and a notched high-pass biquadratic filter. A comparison with the exact numerical evaluation is reported in Fig. C.6 and shows that the previous approximations enable a reliable filter synthesis. The appearance of notches on both sides of the passband can be used to shape IR-UWB pulses and to reduce the amount of energy in restricted UWB bands.
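The relation between Eq. C.7 and its factorized form Eq. C.8 can be explored with a few lines of Python. The sketch below is illustrative only: it assumes that the notch expressions of Eqs. C.9-C.10 are scaled by \( \omega_{02} \) and that the third-order coefficient in Eq. C.7 is \( \omega_{01}/Q_1 \), which is the dimensionally consistent reading. The normalized values \( \omega_{01} = 0.64 \) and \( \omega_{02} = 1/0.64 \) are arbitrary and place the center frequency at 1.

```python
import math


def notch_frequencies(w01, w02):
    """Low and high notch frequencies (Eqs. C.9-C.10, scaled by w02)."""
    root = math.sqrt(1.0 - (w01 / w02) ** 2)
    return w02 * math.sqrt(1.0 - root), w02 * math.sqrt(1.0 + root)


def _denominator(s, w01, w02, q1, q2):
    return (s**2 + s * w01 / q1 + w01**2) * (s**2 + s * w02 / q2 + w02**2)


def h_with_odd_terms(w, w01, w02, q1, q2):
    """Eq. C.7, keeping the first- and third-order numerator terms."""
    s = 1j * w
    left = s**4 + 2 * s**2 * w02**2 + w01**2 * w02**2
    right = s * w01**2 * w02 / q2 + s**3 * w01 / q1
    return (-left + right) / _denominator(s, w01, w02, q1, q2)


def h_factorized(w, w01, w02, q1, q2):
    """Eq. C.8: cascade of two notched biquadratic sections."""
    s = 1j * w
    w_nl, w_nh = notch_frequencies(w01, w02)
    num = (s**2 + w_nh**2) * (s**2 + w_nl**2)
    return -num / _denominator(s, w01, w02, q1, q2)


w01, w02, q = 0.64, 1.0 / 0.64, 2.5          # illustrative normalized values
w_nl, w_nh = notch_frequencies(w01, w02)
w_c = math.sqrt(w01 * w02)                   # geometric center frequency
```

With equal quality factors, the neglected odd-order terms cancel exactly at the center frequency, so both expressions give the same passband gain there; the difference is concentrated around the notches.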

### C.2.4. Design Guidelines for Synthesis

In the previous section, the filter equations have been developed around the normalized angular frequency $\omega = 1$. We give hereafter some design guidelines to determine the passive component values $L_1, C_1, L_2, C_2$ and $R$, which define the filter characteristic around an arbitrary center frequency $\omega_c$.

We start by specifying the two notch frequencies $\omega_{nL}$ and $\omega_{nH}$. From Eq. C.9-C.10, we extract the poles $\omega_{01}$ and $\omega_{02}$ of the notched transfer function of Eq. C.8:

$$\omega_{01} = \sqrt{\frac{2\omega_{nL}^4\omega_{nH}^2 - 2\omega_{nH}^4\omega_{nL}^2}{\omega_{nL}^4 - \omega_{nH}^4}}, \quad (C.11)$$

$$\omega_{02} = \sqrt{\frac{\omega_{nL}^4 - \omega_{nH}^4}{2\omega_{nL}^2 - 2\omega_{nH}^2}}. \quad (C.12)$$

The inductance and capacitance values are defined as follows:

\[
L_1 = \frac{Q_1 R}{\omega_c \sqrt{\omega_{02}/\omega_{01}}} , \quad C_1 = \frac{1}{Q_1 R \omega_c \sqrt{\omega_{02}/\omega_{01}}} , \tag{C.13}
\]

\[
L_2 = \frac{Q_2 R \sqrt{\omega_{02}/\omega_{01}}}{\omega_c} , \quad C_2 = \frac{\sqrt{\omega_{02}/\omega_{01}}}{Q_2 R \omega_c} , \tag{C.14}
\]

where \(Q_i\) are the pole quality factors of the notched transfer functions, \(R\) is the load and \(\omega_c = \sqrt{\omega_{nL} \omega_{nH}}\) is the geometric center frequency defined by the notches.

A numerical example is given hereafter. The transfer function must have a notch at the GPS frequency (1.575 GHz) and cover a bandwidth up to 4.8 GHz. We first define the position of the notches: \(\omega_{nL} = \omega_{GPS} = 2\pi \cdot 1.575\) GHz and \(\omega_{nH} = 2\pi \cdot 7.4\) GHz. The latter notch frequency ensures a sufficient bandwidth to cover the lower UWB frequency band up to 4.8 GHz, even when tolerances on the passive components are considered. A sensitivity analysis against process variations is given in Section C.4. For $R = 50 \, \Omega$ and $Q = 2.5$, we obtain: $L_1 = 3.7 \, \text{nH}$, $C_1 = 0.23 \, \text{pF}$, $L_2 = 9.1 \, \text{nH}$ and $C_2 = 0.58 \, \text{pF}$. Figure C.7 shows the resulting exact transfer function and the group delay of the output stage (continuous curves). Non-ideal inductors with a quality factor of $Q_L = 15$ have additionally been considered in the evaluation of the exact transfer function. The dashed curve corresponds to the approximation $\hat{H}_s(s)$ developed in the previous section and defined by Eq. C.8. $\hat{H}_s(s)$ shows an excellent match to the exact simulated output current transfer function $H_s(s)$.
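The synthesis steps above can be sketched in Python as follows. The function name and the returned dictionary are illustrative; the pole expressions are the simplified forms \( \omega_{02}^2 = (\omega_{nL}^2 + \omega_{nH}^2)/2 \) and \( \omega_{01} = \omega_{nL}\omega_{nH}/\omega_{02} \), which follow from Eqs. C.9-C.10 under the assumption that both are scaled by \( \omega_{02} \).

```python
import math


def synthesize_output_stage(f_nl, f_nh, r=50.0, q1=2.5, q2=2.5):
    """Component values from the two notch frequencies (Eqs. C.11-C.14)."""
    w_nl = 2.0 * math.pi * f_nl
    w_nh = 2.0 * math.pi * f_nh
    w02 = math.sqrt((w_nl**2 + w_nh**2) / 2.0)   # simplified Eq. C.12
    w01 = w_nl * w_nh / w02                      # simplified Eq. C.11
    w_c = math.sqrt(w_nl * w_nh)                 # geometric center frequency
    k = math.sqrt(w02 / w01)
    return {
        "f01": w01 / (2.0 * math.pi),
        "f02": w02 / (2.0 * math.pi),
        "fc": w_c / (2.0 * math.pi),
        "L1": q1 * r / (w_c * k),
        "C1": 1.0 / (q1 * r * w_c * k),
        "L2": q2 * r * k / w_c,
        "C2": k / (q2 * r * w_c),
    }


# Numerical example of the text: GPS notch at 1.575 GHz, upper notch at 7.4 GHz.
design = synthesize_output_stage(1.575e9, 7.4e9)
```

For these inputs the sketch returns values close to the rounded figures quoted in the text (about 3.7 nH and 0.24 pF for the first resonator, 9.1 nH and 0.58 pF for the second), with a center frequency near 3.4 GHz.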

C.2.5. Gain, Bandwidth and Ripple

In Eqs. C.13-C.14, the specification of the two notch frequencies $\omega_{n[L,H]}$ only fixes the poles $\omega_{0[1,2]}$ and the geometric center frequency $\omega_c$. We first assume the load resistance $R$ as a given design parameter and consequently, both quality factors $Q_{[1,2]}$ are left undefined. Classical filter theory states that the choice of $Q_{[1,2]}$ influences the magnitude of the overshoot in the notch transfer functions of Eq. C.8. In the proposed topology, the cascaded arrangement of these notch filters may give rise to a ripple in the passband of the transfer function $H_s(s)$. The magnitude of this ripple is evaluated by a numerical simulation of the exact transfer function in Fig. C.8 (dotted contours) with respect to the quality factors ($Q_1=Q_2$) and the notch frequency ratio $\omega_{nH}/\omega_{nL}$. A quality factor $Q_L$ of 15 has been assumed for the inductors in the simulation. The graph has been restricted to an area where the ripple is smaller than 4 dB.

Figure C.7. Numerical example for lower-band UWB applications. Subplot a) depicts the magnitude $|H_s(s)|$ of the transfer function of the output stage and shows a -3 dB bandwidth of 3 GHz (from $f_L = 2.3$ GHz to $f_H = 5.3$ GHz, round markers). Subplot b) shows the group delay. Within the bandwidth, the group delay exhibits a flat behavior within approximately ± 200 ps. The sharp vertical lines in the approximation are caused by the assumption of ideal inductors. Any degradation of the quality factor of the inductors $L_1$ and $L_2$ reduces the notches to a finite attenuation and smooths the group delay function, as illustrated by the continuous curve.

In the same Fig. C.8, the fractional bandwidth $FBW = \frac{f_H - f_L}{(f_H + f_L)/2}$ is reported (thin contours), as well as the intrinsic gain $|H_s(\omega_c)|$ at the center frequency (thick contours). Assuming an ideal and lossless balun, the current gain $|H_s(\omega_c)|$ of the circuit in Fig. C.4 would be four times the nominal current $i$ (12 dB). In the example of Fig. C.7, the simulated gain is around 10 dB. This corresponds to an intrinsic loss of 2 dB or, equivalently, to an intrinsic gain of $-2$ dB. This loss is caused by the frequency offset between the unity-gain frequencies $\omega_u$ of the two complementary low- and high-pass current inverter circuits. This frequency offset can be freely chosen to set the bandwidth at the summing node, but large offsets reduce the passband gain. We further notice that, for a small bandwidth and a given gain, $Q_{[1,2]}$ can be minimized. This enables minimum values for inductances and maximum values for capacitances. Both features are useful for an on-chip implementation, since small inductors have a better quality factor and large capacitors are less prone to process variations. The numerical example of Section C.2.4 is identified by a square marker. The gray-shaded area in Fig. C.8 represents the model accuracy. The lightest area corresponds to an accuracy better than ±1 dB of the transfer function $\hat{H}_s(s)$ with respect to the exact solution between $f_L$ and $f_H$. Each darker level of gray corresponds to a decrease of 1 dB in accuracy. The simple proposed model achieves a level of confidence of ±2 dB for most of the $Q_{[1,2]}$ values and $\omega_{nH}/\omega_{nL}$ ratios.
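The fractional bandwidth and the intrinsic gain of the numerical example can be recomputed with the short Python sketch below. It only uses the band edges and the 10 dB simulated gain quoted in the text; the helper names are illustrative.

```python
import math


def fractional_bandwidth(f_l, f_h):
    """FBW as defined in the text: bandwidth over arithmetic center frequency."""
    return (f_h - f_l) / ((f_h + f_l) / 2.0)


def to_db(ratio):
    """Current (or voltage) ratio expressed in dB."""
    return 20.0 * math.log10(ratio)


fbw = fractional_bandwidth(2.3e9, 5.3e9)            # about 0.79
ideal_gain_db = to_db(4.0)                          # four times i, about 12 dB
simulated_gain_db = 10.0                            # value quoted for Fig. C.7
intrinsic_gain_db = simulated_gain_db - ideal_gain_db   # about -2 dB
```

The resulting fractional bandwidth of roughly 0.8 lies inside the 0.2 to 1.2 range covered by the thin contours of Fig. C.8.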

![Figure C.8](image)

**Figure C.8.** Fractional bandwidth (thin contours, from 0.2 to 1.2), gain (thick contours, from -3 to 2 dB), ripple (dotted contours) of the output stage. Gray-shaded area represents the accuracy of the approximation developed in the previous sections; the lighter area delimits a model accuracy better than ±1 dB, each darker level indicates a step of 1 dB. The white area in the upper-right corner corresponds to inband gain ripples larger than 4 dB (> ±2 dB) and is out of interest. The numerical example of Section C.2.4 is identified by a square marker.

### C.3. Circuit Schematic

The equivalent schematic of the proposed circuit is depicted in Fig. C.9. The differential current sources of Fig. C.4 have been implemented by two differential pairs, both driven by the same differential UWB signal $v_{in}$. Differential pairs are actually voltage-driven current sources with a non-flat transconductance $G_m(s)$. This implementation issue will be discussed later in Section C.5.2. A fully differential topology is preferred on chip because it permits an excellent rejection of the noise induced by the CMOS digital circuitry. Moreover, the current consumption of IR-UWB transmitters can be reduced by applying efficient duty-cycle schemes. During power-on and power-off phases, a differential topology is preferred since it reduces potential glitches that would corrupt or distort the output spectrum. In practice, the fully differential approach used to drive the output stage requires an additional choke inductor $L_{chk}$ in order to provide bias current to the differential pair loaded by $C_2$ and $L_2$.

The numerical example of Section C.2.4 resulted in an $L_2$ value that may be excessive for an on-chip inductor. The main reason is that high-value inductors have a reduced quality factor, which decreases the attenuation at the notch frequency. Following Eq. C.13 and Eq. C.14, a way to reduce the inductance value is to decrease the load $R$. In the case of an off-chip 50-$\Omega$ load $R$, an additional impedance transformation network (ZTN) based on an LC circuit, as shown in Fig. C.4, has been added to reduce the resistance value of the load $R$ down to approximately 30 $\Omega$. The ZTN uses $L_{ZTN} = 1.0$ nH and $C_{ZTN} = 0.65$ pF. This helps to obtain a low-frequency notch at the GPS frequency with an inductance value $L_2$ reduced from 9.1 nH to 5.5 nH. The other component values become $L_1 = 2.2$ nH, $C_1 = 0.4$ pF and $C_2 = 1.0$ pF. By using a low-pass topology for the ZTN, it is possible to further filter out harmonics that are generated by the non-linearity and the mismatch of the differential pair. Depending on the input signal frequency, mismatch in the differential pair creates $2^{nd}$ harmonics in the single-ended output signal, which are located between 6.2 GHz and 9.6 GHz.
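The effect of the impedance transformation can be illustrated numerically. The sketch below assumes a low-pass L-section in which $C_{ZTN}$ shunts the 50 $\Omega$ load and $L_{ZTN}$ is in series toward the summing node; this arrangement and the 4 GHz evaluation frequency are assumptions made for illustration, since the exact network is only given in the schematic. The second helper rescales the component values of Section C.2.4 according to Eqs. C.13-C.14, where inductances are proportional to $R$ and capacitances to $1/R$.

```python
import math


def ztn_input_impedance(f, r_load=50.0, l_ztn=1.0e-9, c_ztn=0.65e-12):
    """Impedance seen through a series-L, shunt-C low-pass L-section (assumed)."""
    w = 2.0 * math.pi * f
    z_shunt = r_load / (1.0 + 1j * w * c_ztn * r_load)   # load in parallel with C
    return 1j * w * l_ztn + z_shunt


def rescale_for_load(components, r_old, r_new):
    """Scale L proportionally to R and C inversely to R (Eqs. C.13-C.14)."""
    scaled = {}
    for name, value in components.items():
        factor = r_new / r_old if name.startswith("L") else r_old / r_new
        scaled[name] = value * factor
    return scaled


z_in = ztn_input_impedance(4.0e9)
original = {"L1": 3.7e-9, "C1": 0.23e-12, "L2": 9.1e-9, "C2": 0.58e-12}
reduced = rescale_for_load(original, 50.0, 30.0)
```

Under these assumptions, the network presents close to 30 $\Omega$ with a small reactive part around 4 GHz, and the rescaled values agree with the rounded figures quoted above (about 5.5 nH for $L_2$ and 2.2 nH for $L_1$).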

C.4. Sensitivity Analysis to Passive Components

Figure C.9. Simplified schematic of the output stage. Ideal current sources are implemented by MOS transistors and the ZTN consists of an LC-ladder impedance converter.

To ensure an optimum coverage of the defined UWB frequency band against process variations, a wider bandwidth is required. The typical $\pm 3\sigma$ variation of the capacitance value is $\pm 12\%$, whereas the inductance may change by $\pm 10\%$. These variations change the position of both frequency notches and thus the bandwidth and the center frequency of the current transfer function of the output stage. The -3 dB bandwidth is defined by $f_L$ at the lower end and $f_H$ at the upper end. The nominal values of $f_L$ and $f_H$ are 2.3 GHz and 5.3 GHz, respectively, as depicted in Fig. C.7. The theoretical values of $f_L$, $f_H$ and the attenuation in the GPS frequency band ($A_{GPS}$) are summarized in Table C.1 for the worst cases (minimum $f_H$ and maximum $f_L$). Non-ideal inductors with a quality factor of $Q_L = 15$ have been considered in this analysis. We notice that the chosen values ensure an optimal coverage of the lower UWB band against passive component variations. The gain at center frequency $|H_s(\omega_c)|$ remains unaffected. Moreover, a worst-case signal rejection of 12.6 dB is attainable in the GPS band. The nominal rejection $A_{GPS}$ that has been theoretically calculated is approximately 20 dB. The output stage shows little dependence on process tolerances in the passband and the notch frequency. On the other hand, the bandwidth varies, as expected, by 22%.
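A first-order estimate of the worst-case band edges can be obtained by assuming that every characteristic frequency of the network scales as $1/\sqrt{LC}$ when all inductors and all capacitors drift together. This scaling rule is a simplification (it ignores the accompanying change of the quality factors), so the sketch below is only a plausibility check against the worst cases considered in this section.

```python
import math


def scaled_frequency(f_nominal, delta_l, delta_c):
    """Frequency after a relative drift of all inductors and capacitors."""
    return f_nominal / math.sqrt((1.0 + delta_l) * (1.0 + delta_c))


f_l_nominal = 2.26e9   # nominal lower -3 dB edge
f_h_nominal = 5.32e9   # nominal upper -3 dB edge

# Worst cases considered in the text: maximum f_L and minimum f_H.
f_l_max = scaled_frequency(f_l_nominal, -0.10, -0.12)   # about 2.54 GHz
f_h_min = scaled_frequency(f_h_nominal, +0.10, +0.12)   # about 4.79 GHz
f_h_max = scaled_frequency(f_h_nominal, -0.10, -0.12)   # about 5.98 GHz
```

The estimates stay within a few tens of MHz of the corresponding worst-case values reported for this analysis, and the minimum $f_H$ remains close to the 4.8 GHz upper limit of the targeted band.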

C.5. Measurements

The circuit has been fabricated in the 0.18 $\mu$m IBM BiCMOS7WL technology using exclusively CMOS transistors, on-chip inductors and metal-insulator-metal (MIM) capacitors provided by the standard library of the design kit. The circuit was simulated in the Cadence environment for analog circuits. The simulator Spectre was used to carry out the various analyses. A chip photograph of the fabricated test circuit is shown in Fig. C.10. The active area measures 700 $\mu$m by 300 $\mu$m and comprises four on-chip inductors. The implemented values, $L_1 = 0.6$ nH, $C_1 = 300$ fF, $L_2 = 2.1$ nH, $C_2 = 1.5$ pF, $L_{chk} = 6$ nH and $L_{ZTN} = 0.7$ nH, are affected by the parasitics of the implementation and differ from the theoretically calculated ones, which are used as a starting point for the circuit synthesis. Optimal values can be easily found by inspection of the notches of the simulated transfer function.

Table C.1. Numerical investigation of worst-case process variations ($Q_L=15$)

| $\Delta C$ [%] | $\Delta L$ [%] | $f_L$ [GHz] | $f_H$ [GHz] | $A_{GPS}$ [dB] | $\lvert H_s(\omega_c) \rvert$ [dB] |
|----------------|----------------|-------------|-------------|----------------|----------------|
| -12            | -10            | 2.52        | 5.98        | 16.4           | 9.9            |
| 0              | 0              | 2.26        | 5.32        | 19.8           | 10             |
| +12            | +10            | 2.26        | 4.78        | 12.6           | 10.1           |

Figure C.10. Chip photograph of the first IR-UWB output stage prototype, the overall chip size is less than 1 mm$^2$, whereas the effective chip area measures 700 $\mu$m x 300 $\mu$m.
C.5.1. Mixed-Mode S-Parameters

The circuit has been mounted on a high-frequency Rogers RO3010 ($\epsilon_r = 10.2$) test substrate for measurements. The characterization of a 3-port balun network requires differential and common-mode signals. The generation of these modes of propagation is not straightforward with a standard vector network analyzer (VNA), since it requires the use of hybrid couplers or power splitter/combiner baluns in the signal path. The latter devices introduce magnitude and phase imbalances in the signal and do not enable the characterization of mode conversion (differential to common-mode). Furthermore, the calibration of a system with measurement baluns lacks bandwidth and is prone to inaccuracies. To overcome these measurement issues, mixed-mode S-parameters are employed [152]. The mixed-mode S-parameter matrix for the considered 3-port circuit is given in the following equation:

\[
\begin{bmatrix}
   b_d \\
   b_c \\
   b_3
\end{bmatrix} =
\begin{bmatrix}
   S_{dd} & S_{dc} & S_{d3} \\
   S_{cd} & S_{cc} & S_{c3} \\
   S_{3d} & S_{3c} & S_{33}
\end{bmatrix}
\begin{bmatrix}
   a_d \\
   a_c \\
   a_3
\end{bmatrix}, \tag{C.15}
\]

where indices $d$, $c$ and $3$ denote the input differential mode and common mode (both defined by ports P1 and P2 in Fig. C.9) and the single-ended output (port P3), respectively. $[a_n]$ are the normalized waves propagating in the forward (P1/P2 to P3) direction and $[b_n]$ represent the normalized waves propagating in the reverse direction. As shown in Fig. C.11, the various ratios of waves at the differential input port to the single-ended output port are measured in pairs, and from these ratios the mixed-mode parameters $[S]$ can be calculated.
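The conversion from single-ended to mixed-mode parameters can be sketched as a similarity transformation. The Python example below assumes the commonly used power-normalized mode definitions $a_d = (a_1 - a_2)/\sqrt{2}$ and $a_c = (a_1 + a_2)/\sqrt{2}$ with equal reference impedances on all ports; the convention actually used in [152] may differ, and the ideal balun matrix is a purely illustrative input.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def to_mixed_mode(s_single):
    """Convert a single-ended 3-port S-matrix (ports 1, 2, 3) to [d, c, 3]."""
    r = 1.0 / math.sqrt(2.0)
    m = [[r, -r, 0.0],      # differential mode from ports 1 and 2
         [r, r, 0.0],       # common mode from ports 1 and 2
         [0.0, 0.0, 1.0]]   # port 3 stays single-ended
    m_t = [[m[j][i] for j in range(3)] for i in range(3)]   # M is orthogonal
    return matmul(matmul(m, s_single), m_t)


# Illustrative ideal balun: equal magnitude, opposite phase from ports 1/2 to 3.
r = 1.0 / math.sqrt(2.0)
s_ideal = [[0.0, 0.0, r],
           [0.0, 0.0, -r],
           [r, -r, 0.0]]
s_mm = to_mixed_mode(s_ideal)
s_3d = s_mm[2][0]   # differential input to single-ended output
s_3c = s_mm[2][1]   # common-mode input to single-ended output
```

With this convention, $S_{3d} = (S_{31} - S_{32})/\sqrt{2}$ and $S_{3c} = (S_{31} + S_{32})/\sqrt{2}$, so any amplitude or phase imbalance between the two input paths shows up directly as a nonzero $S_{3c}$.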

C.5.2. Transfer Characteristic

The measured magnitude of the $S_{3d}$ transfer function between the differential input and the single-ended output is shown in Fig. C.12 (thick continuous curve) and compared with the post-layout simulations (thick dashed curve). We first notice that the bandpass characteristic agrees well with the simulation. Both measured and simulated characteristics are obtained with the nominal supply voltage of 1.8 V and with a current consumption of 15 mA ($I_B = 7.5$ mA, see Fig. C.9). The latter bias condition fixes the transconductance of the differential pairs to obtain a positive gain $S_{3d}$. The active balun shows a gain of $5\pm2$ dB from 2.5 to 5 GHz (66% bandwidth).

When comparing the measurements in Fig. C.12 with the transfer function $H_s(s)$ developed in Section C.2.3, an important point has to be considered: the current transfer function $H_s(s)$ cannot be measured directly. S-parameter measurements capture voltage waves, so a transconductance is required at the input of the passive network and a transresistance at its output. The output transresistance is simply the resistive load of the VNA, which can be considered as an ideal wideband transresistance. The input transconductance $G_m(s)$, on the other hand, is implemented with transistors, as proposed in Fig. C.9. These components significantly influence the measurement by adding a pole that results from the capacitive input of the active circuit and the source resistance of the VNA. This input capacitance $C_{in}$ mainly stems from the input gate (oxide and gate-source overlap), the Miller capacitance (gate-drain overlap) and the input pad. In the realized prototype, the low-pass pole on $G_m(s)$ is approximately $(R_sC_{in})^{-1} \approx 2$ GHz. A first way to increase the pole frequency is to reduce the voltage gain at the transistor drains (nodes n1, n2 and summing nodes), which lowers the input capacitance by reducing the Miller effect.

The second observed difference between Fig. C.12 and Fig. C.7 is the absence of the high-frequency notch. This is caused by an excessive output admittance of the transconductance stage. An improvement can be obtained by using a cascode input stage, since cascode topologies have a much lower output conductance and a better reverse isolation. The benefits of such an input transconductance circuit are illustrated by the thin dashed line of Fig. C.12: the high-frequency notch can be better identified and the low-frequency notch is more pronounced. Furthermore, the use of a cascode stage pushes the pole of the $G_m(s)$ function towards higher frequencies by reducing the Miller effect, leading to a flatter in-band characteristic.

![Graph showing measured and simulated $S_{3d}$ transfer function.](image)

**Figure C.12.** Measured and simulated $S_{3d}$ transfer function (differential input to single-ended output).

The observed discrepancy between the simulated and the measured notch at 1.83 GHz is mainly caused by an inaccuracy in the parasitic extraction process at the transistor level. To explain this deviation, a simulation excluding the extraction of the circuit in the area of the differential pairs is given by the dash-dotted curve and exhibits a nearly perfect match in the notch frequency.

**C.5.3. Group Delay**

Figure C.13 shows the measured and simulated group delay between the input differential port and the single-ended output port. After de-embedding the microstrip lines of the measurement board, a flat group delay of $700 \pm 150$ ps is obtained within the band of interest from 3.1 to 4.8 GHz. Here again, the only observed discrepancy lies in the $L_2-C_2$ notch frequency and has the same cause as explained in Section C.5.2.
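For reference, the group delay is the negative derivative of the unwrapped phase of $S_{3d}$ with respect to angular frequency, $\tau_g = -\mathrm{d}\varphi/\mathrm{d}\omega$. The sketch below shows how it could be computed from sampled complex S-parameter data. It is an illustration only: the synthetic data set (a pure delay of 700 ps over 3.1 to 4.8 GHz) and the function name are assumptions, not the measured data of Fig. C.13.

```python
import numpy as np

def group_delay(freq_hz, s_complex):
    """Group delay in seconds from complex transmission data.

    tau_g = -d(phase)/d(omega) = -(1 / (2*pi)) * d(phase)/d(f)
    """
    phase = np.unwrap(np.angle(s_complex))        # remove 2*pi jumps
    return -np.gradient(phase, freq_hz) / (2.0 * np.pi)

# Synthetic example (assumption): ideal 700 ps delay line.
freq = np.linspace(3.1e9, 4.8e9, 171)             # 10 MHz steps
tau_ideal = 700e-12
s3d_synthetic = np.exp(-2j * np.pi * freq * tau_ideal)

gd = group_delay(freq, s3d_synthetic)
```

The frequency step must be fine enough that the phase changes by less than $\pi$ between adjacent points, otherwise the unwrapping step fails.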

![Graph showing group delay S3d vs Frequency](image)

**Figure C.13.** Measured and simulated $S_{3d}$ group delay (differential input to single-ended output).

### C.5.4. Amplitude and Phase Imbalance

In simulation, the circuit achieves an amplitude imbalance $|S_{31}| - |S_{32}|$ and a phase imbalance $\angle S_{31} - \angle S_{32}$ in the passband of $\pm 1$ dB and $\pm 12^\circ$, respectively. Figure C.14 compares the measured and simulated performances of both amplitude and phase imbalance between each port of the differential input and the output. The circuit exhibits a measured amplitude imbalance between 0 and $-2$ dB, whereas the measured phase imbalance lies between 0 and $-20^\circ$. The small observed discrepancies are caused by mismatches in the input differential path and by parasitics.

Figure C.14. Measured and simulated imbalance of the balun function. Both graphs are zoomed in the passband from 2.5 to 5.5 GHz.
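The two imbalance figures can be computed from the single-ended transmission parameters as sketched below. Two assumptions are made here that the text does not state explicitly: the amplitude imbalance is taken as the difference of the magnitudes expressed in dB, and the phase imbalance is taken as the deviation of $\angle S_{31} - \angle S_{32}$ from the ideal $180^\circ$. The numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def balun_imbalance(s31, s32):
    """Return (amplitude imbalance in dB, phase imbalance in degrees).

    The phase imbalance is referred to the ideal antiphase condition,
    i.e. it is zero when S31 and S32 are exactly 180 degrees apart
    (assumption of this sketch).
    """
    s31 = np.asarray(s31, dtype=complex)
    s32 = np.asarray(s32, dtype=complex)
    amp_db = 20.0 * np.log10(np.abs(s31)) - 20.0 * np.log10(np.abs(s32))
    # angle(-S31/S32) equals angle(S31) - angle(S32) - 180 deg, wrapped.
    phase_deg = np.degrees(np.angle(-s31 / s32))
    return amp_db, phase_deg

# Hypothetical single-frequency example: the P2 path is about 1 dB
# stronger and lags the ideal antiphase condition by 10 degrees.
s31_example = 1.0 + 0.0j
s32_example = 1.122 * np.exp(1j * (np.pi + np.radians(10.0)))

amp_imb_db, phase_imb_deg = balun_imbalance(s31_example, s32_example)
```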

C.5.5. Linearity

An in-depth analysis of the linearity is given in this section, where harmonic and intermodulation distortion measurements have been conducted with single- and two-tone tests, respectively. Harmonic distortion (HD) of the output stage has been measured and compared with simulations in Fig. C.15. The input signal is a sine wave whose frequency is set in the passband at 4 GHz; its power $P_{in}$ is swept from -30 to 0 dBm. The measured and simulated 1 dB compression points show an excellent match and are $CP_{1 dB} = -8$ dBm.

The most significant difference between simulations and measurements is observed for the 2nd harmonic ($f_{out} = 8$ GHz), which shows a measured value 12 dB larger than the simulated one. Part of this discrepancy (i.e. approximately 6 dB) can be explained by the difference between the measured and simulated attenuation at 8 GHz, as shown in Fig. C.12. The remaining error can partly be explained by mismatch and offset effects at the differential input stage ($\pm 3$ dB obtained with Monte-Carlo simulations). Inaccuracies of the transistor model and the nonlinearity of the measuring equipment also contribute to the increase in HD. However, owing to its bandpass characteristic, the circuit exhibits a high harmonic intercept point $P_{\text{HDI}}$ around 14 dBm. This value could be made even higher by setting the high-frequency notch $f_{nH}$ at twice the center frequency of the passband.

The third order harmonic agrees well with the simulations. No intercept point can be extracted due to the high attenuation at 12 GHz.

![Harmonic Distortion Analysis](image)

**Figure C.15.** Linearity of the output stage. Harmonic distortion analysis using single-tone test at 4 GHz.

Intermodulation distortions are illustrated in Fig. C.16. Measurements have been conducted with a two-tone test around 4 GHz. The input-referred third order intercept point IIP3 has been measured at -1.5 dBm and shows a good match with simulations.
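
As a reminder of how an input-referred intercept point follows from a two-tone measurement in the weakly nonlinear region, the standard relation is $\mathrm{IIP3} = P_{in} + \Delta P/2$, where $\Delta P$ is the difference in dB between the fundamental and the third-order intermodulation product at the output. The sketch below applies this relation. The power levels are hypothetical values chosen to reproduce the quoted $-1.5$ dBm; they are not the measured data of Fig. C.16.

```python
def iip3_dbm(p_in_dbm, p_fund_out_dbm, p_im3_out_dbm):
    """Input-referred third-order intercept point from a two-tone test.

    Valid only well below compression, where the IM3 product rises
    3 dB for every 1 dB of input power.
    """
    delta_db = p_fund_out_dbm - p_im3_out_dbm
    return p_in_dbm + delta_db / 2.0

# Hypothetical readings (per tone), for illustration only.
P_IN_DBM = -30.0        # input power per tone
P_FUND_OUT_DBM = -25.0  # fundamental at the output
P_IM3_OUT_DBM = -82.0   # third-order product at the output

iip3 = iip3_dbm(P_IN_DBM, P_FUND_OUT_DBM, P_IM3_OUT_DBM)
```
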
C.6. Applications

IR-UWB signalling schemes were specified by the IEEE in 2007 [34]. This standard is subject to the same regulation masks that fix strict limits on the emission of UWB signals by unlicensed devices [8]. One of the main issues in IR-UWB is the generation of short pulses having a well-defined and controllable spectral content to comply with this regulation.

**Figure C.16.** Linearity of the output stage. Third order input intercept point analysis using two tones centered at 4 GHz and separated by 1 MHz.

Spectrum shaping is illustrated in Fig. C.17. The plot compares the simulated power spectral densities (PSD) of a carrier-based IR-UWB signal centered around 4 GHz before and after it is passed through the proposed circuit. Since the device provides gain, both input and output PSDs have been normalized to the FCC emission limit of -41.3 dBm/MHz. The pulse length of 1 ns has been chosen so that the main lobe fits the FCC mask around the lower limit of 3.1 GHz. The energy contained in the sidelobes is determined by the pulse envelope. A rectangular envelope with 100 ps edges has been chosen to provide sufficient energy in the FCC restricted band between 0.96 and 1.61 GHz. This clearly represents a worst case (dashed line), since a sidelobe falls exactly in the restricted band. The continuous line depicts the output PSD of the same pulse passed through the proposed circuit. We observe that the energy in the restricted GPS band is reduced to comply with the UWB FCC spectrum mask (dash-dotted line).
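The statement that a sidelobe falls exactly in the restricted band can be checked with a simple idealized model. The sketch below assumes a perfectly rectangular envelope (the 100 ps edges are ignored) and neglects the negative-frequency image, so that the PSD around the carrier follows $\mathrm{sinc}^2\big((f - f_c)T\big)$ with nulls at $f_c \pm k/T$. It is not the simulation used for Fig. C.17.

```python
import numpy as np

F_C = 4.0e9        # carrier frequency from the text
T_PULSE = 1.0e-9   # pulse length from the text

def rect_pulse_psd_db(freq_hz, f_c=F_C, t_pulse=T_PULSE):
    """PSD of an ideal rectangular RF pulse, in dB relative to the
    main-lobe peak. np.sinc(x) is sin(pi*x)/(pi*x)."""
    x = (np.asarray(freq_hz, dtype=float) - f_c) * t_pulse
    psd = np.sinc(x) ** 2
    return 10.0 * np.log10(np.maximum(psd, 1e-30))

# Scan the restricted band quoted in the text.
freq = np.linspace(0.96e9, 1.61e9, 6501)
psd_db = rect_pulse_psd_db(freq)
f_peak = freq[np.argmax(psd_db)]   # location of the sidelobe maximum
peak_db = psd_db.max()             # level relative to the main lobe
```

In this idealized model the main lobe extends from 3 to 5 GHz and the second lower sidelobe occupies 1 to 2 GHz, with its maximum close to 1.5 GHz and roughly 18 dB below the main-lobe peak, which is consistent with the worst-case argument above.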

As illustrated in Fig. C.1, the proposed circuit can be inserted between a time-domain pulse generator, such as the one proposed in [86], and a UWB antenna. An accurate tuning of the frequency characteristic can be realized by replacing the fixed capacitors of the LC-based current inverters by voltage- or digitally-controlled capacitors. Furthermore, the antenna may help to further reduce the remaining sidelobes appearing in Fig. C.17. The advantage of the proposed passive section for a UWB output stage is that it can be directly implemented after any circuit using a differential output by simply replicating the differential pair output to drive the complementary current inverters.

![Simulated input and output power spectral densities vs. FCC mask.](image)

**Figure C.17.** Simulated input and output power spectral densities (dashed and continuous) vs. FCC mask for indoor services (dash-dotted). For a fair comparison, power spectral densities are normalized to the maximum emission limit of -41.3 dBm/MHz.
C.7. Conclusions

The deployment of UWB communication systems using short-duration pulses has to cope with many challenges, such as the generation of signals in the time domain complying with regulation masks in the frequency domain. This appendix presents the design and the characterization of a novel balun output stage topology for carrier-based IR-UWB. The circuit is based on a double current inversion. It features the conversion of the on-chip differential signal into a single-ended one over a wide bandwidth, while enabling the spectrum to be shaped to comply with the FCC UWB regulation. This frequency filtering function is particularly useful to shape the ill-defined spectrum of IR-UWB signals that are generated in the time domain without any frequency up-conversion or accurate envelope generation.
D

AD/DA Calibrated Free-running VCO

D.1. Principles

The classical approach for LO generation is based on PLL frequency synthesizers [189]. The core of a PLL consists of a voltage-controlled oscillator (VCO). Its output frequency is internally divided and compared to a stable external reference by a phase frequency detector (PFD), which generates an error signal in order to adjust the control voltage of the VCO.

In the case of applications requiring relaxed frequency accuracies, the whole feedback mechanism can be considered as an expensive way to generate a constant VCO control voltage. A more power-efficient method to generate an RF signal with a desired frequency is the direct application of a calibrated control voltage to the VCO with the help of a digital-to-analog converter (DAC), as shown in Fig. D.1-a. The latter replaces the divider, charge pump, PFD and loop filter, which are no longer required and can therefore be shut down (“off” blocks identified by gray color). This is motivated by the fact that the power requirements of a DAC are much less critical than those of the feedback chain of a PLL, which runs at speeds up to the desired RF frequency (e.g. the RF prescaler).

Figure D.1. High-level schematic showing the principle of the self-calibrated oscillator approach for an RF frequency of 4.2 GHz; a) LO generation by DAC-driven VCO control voltage $V_c$; b) calibration phase by means of a PLL.

The required control voltage that has to be applied to the VCO for a given LO output frequency is obtained after a calibration procedure, which is performed with the help of a conventional integer-N PLL. During the calibration phase, illustrated in Fig. D.1-b, the DAC is disconnected from the PLL by a switch (single-pole single-throw, SPST). The PLL is put in its normal operating mode until it locks. After that, the VCO control voltage $V_c$ is read by an analog-to-digital converter (ADC) and stored in the memory of a micro-controller ($\mu$C). Both ADC and DAC have a sufficient resolution (12-bit) to achieve the needed accuracy. This procedure is repeated for different divider settings in order to get a digital value for every required LO frequency. In the proposed topology, the three LO frequencies used for the three different channels A, B and C require the storage of three different values of $V_c$ (Fig. 4.4). The frequencies used for the BFSK modulation at the transmitter can be calibrated in a similar way. After the calibration, the VCO is directly controlled by the DAC retrieving the digitally stored values, as shown in Fig. D.1-a. Every component of the PLL, except the VCO, is disabled at this stage.
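The calibration sequence described above can be summarized by the following behavioral sketch. It is a toy model under loudly stated assumptions and not the firmware of the realized system: the VCO tuning curve, the 1.8 V converter full scale, the three channel frequencies and all function names are hypothetical, and PLL locking is emulated by a bisection search for the control voltage that yields the target frequency.

```python
N_BITS = 12          # ADC/DAC resolution quoted in the text
V_FULL_SCALE = 1.8   # assumed converter full-scale voltage (hypothetical)

def toy_vco_freq(v_c):
    """Hypothetical monotonic VCO tuning curve, in Hz."""
    return 3.4e9 + 1.0e9 * v_c - 0.05e9 * v_c ** 2

def pll_lock_voltage(f_target, iterations=60):
    """Stand-in for a locked PLL: find V_c such that the VCO runs at f_target."""
    lo, hi = 0.0, V_FULL_SCALE
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if toy_vco_freq(mid) < f_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def adc_read(voltage):
    """Ideal 12-bit quantization of the control voltage."""
    return round(voltage / V_FULL_SCALE * (2 ** N_BITS - 1))

def dac_write(code):
    """Ideal 12-bit reconstruction of the stored control voltage."""
    return code / (2 ** N_BITS - 1) * V_FULL_SCALE

def calibrate(channels):
    """Closed-loop phase: lock on each channel and store the ADC code."""
    return {name: adc_read(pll_lock_voltage(f)) for name, f in channels.items()}

# Hypothetical channel plan, for illustration only.
channels = {"A": 3.6e9, "B": 4.2e9, "C": 4.8e9}
table = calibrate(channels)

# Open-loop phase: the DAC replays the stored codes, the PLL is off.
errors_hz = {name: toy_vco_freq(dac_write(code)) - channels[name]
             for name, code in table.items()}
```

In this idealized model the residual open-loop error is set only by the converter quantization. In practice, noise, drift and converter nonlinearity dominate, as discussed in Section D.2.1.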

The calibration procedure can be repeated every fraction of a second to compensate for potential VCO frequency drifts caused by temperature and supply voltage variations. A full calibration has only a minimal impact on the average power consumption; its duration depends mainly on the ADC acquisition speed, the PLL settling time and the number of frequencies to calibrate, and can be estimated at a few microseconds with the proposed solution.

The use of IR-UWB wideband signals enables a relaxed tolerance against small LO frequency offsets. In order to find out the accuracy requirements the open loop oscillator must fulfill, a simulation was performed in Section 3.10.3 (Fig. 3.31) showing the impact of LO inaccuracies on the bit-error rate of the BFSK modulation scheme. Experimental results are shown in Fig. 7.17: a cumulated frequency offset smaller than 40 MHz between the transmitter carrier and the receiver LO can be tolerated while keeping the BER degradation smaller than 1.4 dB (1 dB in simulation).

D.2. Results

D.2.1. Frequency Accuracy

Despite the use of a calibration, several factors can lead to inaccuracies in the setting of the VCO frequency in open loop. Noise on the supply and ground lines is a major concern. To minimize these effects, the VCO has been built in a fully-balanced topology, which helps to reduce the effect of supply and ground potential fluctuations. The DAC/ADC subsystem is also based on a pseudo-differential topology to avoid errors caused by fluctuations of the common mode of the control voltage. Intrinsic characteristics of the DAC, such as non-linear behavior, affect the accuracy of the calibration as well.

The accuracy of the proposed calibration procedure is shown in Fig. D.2. The open loop average frequency is compared to the closed loop frequency. The absolute deviation could be kept within 20 MHz over the 1.5 GHz IR-UWB bandwidth used. This corresponds to a frequency accuracy of ±1.3% on the entire tuning range.

Figure D.2. Measured open loop oscillator frequency accuracy showing the difference against PLL frequency in closed loop. This error corresponds to a frequency accuracy of $\pm 1.3\%$ on the entire tuning range.

D.2.2. Spectral Output

The comparison between the spectra of the VCO output in open loop and closed loop modes is depicted in Fig. D.3. The PLL synthesizer’s output spectrum shows a sharp peak at the desired frequency due to the close-in phase noise reduction provided by the feedback loop. At frequency offsets larger than the PLL bandwidth (here approx. 30 MHz), the VCO phase noise dominates and the two spectra are similar. The slightly higher phase noise in closed loop at large frequency offsets originates from the additive noise of the PLL divider and charge pump.

In open loop, the absence of a stable reference causes the instantaneous frequency to fluctuate around the center frequency, which results in a spectrum bandwidth of approximately 10 MHz. This can be seen as a long-term fluctuation that adds to the average measured offset. The resulting instantaneous frequency offset stays however within the 50 MHz range required for a BER degradation smaller than 1 dB (Fig. 7.17).

An additional benefit of the open-loop operation is the strong reduction of spurs generated by the phase comparison with the PLL's reference frequency. The control voltage provided by the DAC is stable and does not modulate the VCO output frequency.

D.2.3. Frequency Settling Time

The main interest of the approach is the dynamic behavior of the frequency setting. Due to the lack of a feedback loop, the slow locking time associated with PLL settling is eliminated. Therefore, the startup time required by the LO is greatly reduced. This adds the capability of an efficient duty-cycled LO operation.

The instantaneous frequency was determined in MATLAB by measuring the time between the zero-crossings of the LO signal. Compared to the open loop oscillator, the PLL takes a significant amount of time to lock at the desired frequency. Its high spectral purity comes at the cost of a slow response. The PLL requires about 750 ns to lock from the off state, whereas the directly controlled VCO oscillates at the desired frequency after only 20 ns. A similar technique has already been used to reduce the settling time of a conventional PLL by setting the VCO control voltage to a value close to the expected voltage in the locked state [190].
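
The zero-crossing method can be sketched as follows. This is a Python illustration of the general technique and not the MATLAB script used for the measurements: the sampling rate, the test tone and the function name are assumptions. Rising zero-crossings are located by linear interpolation between adjacent samples, and the instantaneous frequency is the reciprocal of the time between consecutive crossings.

```python
import numpy as np

def instantaneous_frequency(t, x):
    """Estimate the instantaneous frequency from rising zero-crossings.

    Returns (t_mid, f_inst): the midpoint time of each period and the
    corresponding frequency estimate in Hz.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    # Indices where the signal goes from negative to non-negative.
    idx = np.nonzero((x[:-1] < 0.0) & (x[1:] >= 0.0))[0]
    # Linear interpolation of the exact crossing instant.
    t_cross = t[idx] - x[idx] * (t[idx + 1] - t[idx]) / (x[idx + 1] - x[idx])
    periods = np.diff(t_cross)
    t_mid = 0.5 * (t_cross[:-1] + t_cross[1:])
    return t_mid, 1.0 / periods

# Synthetic test signal (assumption): 4.2 GHz tone sampled at 100 GS/s.
F_SAMPLE = 100e9
t = np.arange(0.0, 50e-9, 1.0 / F_SAMPLE)
lo_signal = np.sin(2.0 * np.pi * 4.2e9 * t + 0.3)

t_mid, f_inst = instantaneous_frequency(t, lo_signal)
```

The time resolution of this estimate is one LO period, which is sufficient to resolve settling transients of tens of nanoseconds.
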

Figure D.4. Measured settling times of a) PLL in closed loop and b) free-running oscillator with calibrated control voltage (inset figure zooms between 0 and 50 ns). The instantaneous frequency is extracted from the signal’s zero-crossings.

D.3. Power Consumption Analysis

A numerical example is shown in this section for the sake of illustrating the power reduction which is achievable in an IR-UWB RF front-end using the proposed solution.

In this example, a generic pulse repetition rate of 1 MHz is assumed. An integrate-and-dump scheme using integration time $T_i = 20$ ns is considered as an optimum solution to detect carrier-based IR-UWB pulses corrupted by multipath reflections (see Section 3.8). The time required by a baseband unit using three consecutive Early/In-Time/Late windows of duration $T_i$ to track the pulse stream is $3 \cdot T_i = 60$ ns. During this period of time, a stable and accurate LO signal must be provided to down-convert an RF pulse around DC. An overview of the timing and power consumption values assumed in this analysis is given in Table I.

For each transmitted pulse, the RF front-end (LNA/mixer) and the VCO need only be enabled for 80 ns including startup time, while the PLL requires 750 ns in order to lock. This leads to the following average power consumption $P_{CL}$ for the closed loop oscillator:

$$P_{CL} = 14.5 \text{ mW} \cdot \frac{80 \text{ ns}}{1000 \text{ ns}} + 47 \text{ mW} \cdot \frac{(750 + 60) \text{ ns}}{1000 \text{ ns}} + 2.8 \text{ mW} \approx 42 \text{ mW}. \quad (D.1)$$

Using the new approach, the fast startup time of the open loop oscillator allows much longer idle periods of the LO section. In addition to that, its power consumption is lower since the only active high-frequency device is the VCO. On the other hand, the power required by the added components such as the DAC must be added to the calculation. Altogether, the new approach leads to the following average power consumption $P_{OL}$:

$$P_{OL} = (14.5 + 20) \text{ mW} \cdot \frac{80 \text{ ns}}{1000 \text{ ns}} + (2.8 + 2.7) \text{ mW} \approx 8.3 \text{ mW}. \quad (D.2)$$

This result assumes synchronized transmitter and receiver and does not include the power required by the calibration periods. Because the calibration periods are very short compared to the interval between calibrations, their influence on the average power consumption is minimal and has therefore been neglected.
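
The arithmetic of (D.1) and (D.2) can be reproduced with the short sketch below. It simply mirrors the terms of the two equations, with each block contributing its power weighted by its on-time within the 1000 ns pulse repetition period. The helper name and the grouping of terms are choices of this sketch; the assignment of each value to a circuit block follows Table I.

```python
T_FRAME_NS = 1000.0   # pulse repetition period for a 1 MHz repetition rate

def average_power_mw(duty_cycled_terms, always_on_mw):
    """Average power of duty-cycled blocks plus always-on blocks.

    duty_cycled_terms: iterable of (power in mW, on-time in ns) pairs.
    """
    gated = sum(p_mw * t_on_ns / T_FRAME_NS for p_mw, t_on_ns in duty_cycled_terms)
    return gated + always_on_mw

# Closed loop, terms of (D.1): 14.5 mW for 80 ns, 47 mW for (750 + 60) ns,
# plus 2.8 mW always on.
p_cl = average_power_mw([(14.5, 80.0), (47.0, 750.0 + 60.0)], 2.8)

# Open loop, terms of (D.2): (14.5 + 20) mW for 80 ns,
# plus (2.8 + 2.7) mW always on.
p_ol = average_power_mw([(14.5 + 20.0, 80.0)], 2.8 + 2.7)

ratio = p_ol / p_cl   # open-loop power relative to closed-loop power
```

The ratio evaluates to slightly less than 0.2, i.e. the open loop approach requires about one fifth of the closed loop average power.
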

D.4. Conclusions

In this Appendix, an open loop oscillator approach including a self-calibration technique has been compared to a classical PLL solution in terms of spectral output, accuracy, settling time and power consumption. The method consists of the generation of an LO signal at the desired frequency through the direct control of the VCO input voltage. This control voltage is provided by a DAC, whose output has been first calibrated by a PLL in closed loop.

The presented oscillator features a greatly reduced settling time compared to the classical PLL approach with an accuracy of 20 MHz, which is sufficient to meet the requirements of a non-coherent IR-UWB receiver.

A numerical example based upon practical values of an RF front-end realized at ETH Zurich showed how the proposed approach can contribute to reducing the power requirements of an IR-UWB communication system to a fifth of its original value, with negligible loss in performance.
Bibliography




[17] Electronic Communications Committee (ECC), “ECC Decision of 1 December 2006 on supplementary regulatory provisions to Decision ECC/DEC/(06)04 for UWB devices using mitigation techniques.” ECC/DEC/(06)12, Oct. 2008. 2.3.6

[18] Electronic Communications Committee (ECC), “Technical requirements for UWB DAA (detect and avoid) devices to ensure the protection of radio-location services in the bands 3.1-3.4 GHz and 8.5-9 GHz and BWA terminals in the band 3.4-4.2 GHz.” Kristiansand, June 2008. 2.3.7

[19] CEPT - Electronic Communications Committee (ECC), “ECC Decision of 1 December 2006 on the harmonised conditions for devices using Ultra-Wideband (UWB) technology with Low Duty Cycle (LDC) in the frequency band 3.4-4.8 GHz.” Doc. ECC/DEC/(06)12, December 1, 2006. 2.3.7

[20] Electronic Communications Committee (ECC), “Technical requirements for UWB LDC devices to ensure the protection of FWA systems.” Nicosia, December 2006. 2.3.7




[34] IEEE Standard for Information Technology, “IEEE 802 Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (WPANs), Amendment 1: Add Alternate PHYs,” Aug. 2007. 2.6.4, C.6


\textsuperscript{1}www.uwbforum.org expired on Mar. 08, 2008.


[72] M. B. Pursley, Introduction to Digital Communications. Prentice Hall, 2005. 3.7.4


[85] Samyoung Electronics, “Preliminary Datasheet, UWB Pre-select Filter, Part No. SF39A05218NX.” 3.11.4


[89] IEEE Computer Society, “Standard for Information Technology, Specific Requirements, Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY): Specifications for Low-Rate Wireless Personal Area Networks (WPANs), Amendment 1: Add Alternate PHYs 802.15.4a,” 31 August 2008. 4.2.2, 4.2.3, 4.4.2




[115] IBM Microelectronics Division, “BiCMOS 7WL Design Manual,” June 20, 2006. 5.3.3, 6.2.4





*Microwave Conference*, pp. 1449–1452, 10–15 Sept. 2006. 6.5.5, 6.5.6


[154] V. Sze and A. P. Chandrakasan, “A 0.4-V UWB baseband processor,” in *Proc. ISLPED’07*, (Portland, OR, USA), pp. 262–267, Aug. 2007. 7.1, 7.9, 7.1




Technical Papers Solid-State Circuits Conference ISSCC. 2005
IEEE International, pp. 442–608, 10 Feb. 2005. 7.9, 7.1


Curriculum Vitae

Personal Data

David Barras
Born October 17, 1972 in Sierre (VS), Switzerland
Citizen of Chermignon (VS), Switzerland

Education and Professional Experience

2003-2010
PhD studies in Information Technology and Electrical Engineering (Dr. sc. ETH Zurich) at ETH Zurich, Switzerland

2001-2003
Freelance RF engineer, Zurich, Switzerland and part-time research and teaching assistant at the Electronics laboratory, ETH Zurich, Switzerland

1997-2001
Senior RF engineer, Head of “Hardware Group” of “Business Unit Swatch”, ETA S. A., Grenchen (SO), Switzerland

1992-1997
MSc studies in Electrical Engineering (EE) at EPF Lausanne, Switzerland

List of Publications and Conferences

As first author:


As co-author:




