#### **ETH** zürich

# RISC-V Meets 22FDX: an Open Source Ultra-low Power Microcontroller Platform for Advanced FDSOI Technologies

Presentation

Author(s): Schiavone, Pasquale; Charagulla, Sanjay; Teepe, Gerd; <u>Benini, Luca</u>; PULP team

Publication date: 2018-05-09

Permanent link: https://doi.org/10.3929/ethz-b-000311769

Rights / license: In Copyright - Non-Commercial Use Permitted





# **RISC-V Meets 22FDX: an Open Source Ultra-low Power Microcontroller Platform for Advanced FDSOI Technologies**

<u>Pasquale Davide Schiavone<sup>1</sup></u>, <u>Sanjay Charagulla<sup>3</sup></u>, Gerd Teepe<sup>3</sup>, Luca Benini<sup>1,2</sup> and the PULP team

<sup>1</sup>Integrated Systems Laboratory, ETH Zurich, Switzerland

<sup>2</sup>Energy Efficient Embedded Systems Laboratory, University of Bologna, Bologna, Italy

<sup>3</sup>GLOBALFOUNDRIES



<sup>1</sup>Department of Electrical, Electronic and Information Engineering

> ETH zürich <sup>2</sup>Integrated Systems Laboratory

09.05.2018

### Near Sensor (aka Edge) Processing



### **PULP is Open-Source & Free**

#### Releases



#### February 2016

First release of **PULPino**, our single-core microcontroller



#### May 2016

Toolchain and compiler for our RISC-V implementation (**RI5CY**), DSP extensions


#### August 2017

PULPino updates, new cores Zero-riscy and Microriscy, FPU, toolchain updates



#### Beginning of February 2018

PULPissimo; ARIANE: 64-bit RISC-V core



#### End of February 2018

OPEN-PULP

#### **Community Contributions**

#### June 2017

Porting of Verilator and BEEBS benchmarks to PULPino

https://github.com/embecosm/ri5cy

#### September 2017

Porting of ARM CMSIS to PULPino

https://github.com/misaleh/CMSIS-DSP-PULPino

#### November 2017

Numerous bug fixes to RISC-V in PULPino

https://github.com/pulp-platform/riscv

#### December 2017

**STING**: Open-Source Verification Environment for PULPino

http://valtrix.in/programming/running-sting-on-pulpino

#### Download PULP @ https://github.com/pulp-platform

Davide Schiavone - ETH Zurich


### **Quentin: GF22FDX PULPissimo Implementation**

- RISC-V based advanced microcontroller
  - 512kB of L2 Memory
  - 16kB of energy efficient latch-based memory (L2 SCM BANK)
- Rich set of peripherals:
  - QSPI (up to 280 Mbps)
  - HyperRam + HyperFlash (up to 100 MB/s)
  - Camera Interface (up to 320x240@60fps)
  - I2C, I2S (up to 4 digital microphones)
  - JTAG (Debug), GPIOs
  - Interrupt controller, Bootup ROM
- Autonomous IO DMA Subsystem (µDMA)
- Power management
  - 2 low-power FLLs (IO, SoC, ...)
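The peak rates in the peripheral list use mixed units (Mbps, MB/s, frames per second). The sketch below converts them to MB/s so they can be compared. The camera pixel depth is not given on the slide, so `CAMERA_BYTES_PER_PIXEL` is an assumption for illustration only, as are all function and variable names.

```python
# Convert the peak peripheral rates quoted on the slide to a common unit (MB/s).
# Assumption: 2 bytes per camera pixel (e.g. a 16-bit format); the slide does
# not state the pixel depth.
CAMERA_BYTES_PER_PIXEL = 2


def mbps_to_mbytes_per_s(mbps: float) -> float:
    """Megabits per second to megabytes per second."""
    return mbps / 8.0


def camera_mbytes_per_s(width: int, height: int, fps: int, bytes_per_pixel: int) -> float:
    """Raw video bandwidth in MB/s (1 MB = 1e6 bytes)."""
    return width * height * fps * bytes_per_pixel / 1e6


qspi = mbps_to_mbytes_per_s(280)                      # QSPI: up to 280 Mbps
hyper = 100.0                                         # HyperRam + HyperFlash: up to 100 MB/s
camera = camera_mbytes_per_s(320, 240, 60, CAMERA_BYTES_PER_PIXEL)
total = qspi + hyper + camera

print(f"QSPI   : {qspi:.1f} MB/s")
print(f"Hyper  : {hyper:.1f} MB/s")
print(f"Camera : {camera:.3f} MB/s")
print(f"Total  : {total:.3f} MB/s")
```

Under these assumptions the three high-rate interfaces together stay below 150 MB/s at peak.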





### Efficient I/O subsystem: the µDMA



- Transfer bandwidth is close to the physical limit of the architecture
- A single channel can saturate the memory port (it can work at full bandwidth)
- Bandwidth is not affected by the peripherals' buffer size



### **Efficient Interconnect**



Interleaving to reduce contention: 4 banks

2 non-interleaved banks for private data and code
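To illustrate why interleaving reduces contention, here is a minimal sketch of word-level interleaving across 4 banks. The 32-bit word granularity and the function names are assumptions for illustration; the slide does not specify the actual address mapping used in PULPissimo.

```python
# Word-level interleaving across 4 banks (sketch).
# Assumptions: 32-bit words and modulo mapping on the word index.
WORD_BYTES = 4
NUM_INTERLEAVED_BANKS = 4  # from the slide


def interleaved_bank(addr: int) -> tuple:
    """Map a byte address to (bank index, row inside the bank)."""
    word = addr // WORD_BYTES
    return word % NUM_INTERLEAVED_BANKS, word // NUM_INTERLEAVED_BANKS


def bank_conflicts(addrs) -> int:
    """Number of same-cycle requests that collide on an already-used bank."""
    banks = [interleaved_bank(a)[0] for a in addrs]
    return len(banks) - len(set(banks))


# Four consecutive words land in four different banks: no contention.
sequential = [0x00, 0x04, 0x08, 0x0C]
# Two addresses 16 bytes apart map to the same bank: one conflict.
strided = [0x00, 0x10]

print(bank_conflicts(sequential))
print(bank_conflicts(strided))
```

With this mapping, masters streaming through consecutive addresses spread their accesses over all banks, while a non-interleaved bank keeps a contiguous region (such as private code or data) in a single memory.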





### **RI5CY Processor**

- 4-stage pipeline
  - RV32IMFCXpulp
  - 70 kGE (GF22 NAND2-equivalent gates) + 30 kGE for the FPU
  - CoreMark/MHz: 3.19
- Includes various extensions
  - pSIMD
  - Fixed point
  - Bit manipulations
  - HW loops
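As a conceptual illustration of the packed-SIMD idea, the sketch below models a dot product over four signed 8-bit lanes packed into one 32-bit word. It is a behavioural model written for this text, not the Xpulp instruction definition; the helper names `pack8`, `lanes8` and `sdot4` are hypothetical.

```python
# Behavioural model of a 4 x 8-bit packed dot product with accumulation.
# Illustration only; the helper names are hypothetical.

def pack8(values) -> int:
    """Pack four signed 8-bit integers into one 32-bit word (lane 0 = LSB)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word


def lanes8(word: int) -> list:
    """Unpack a 32-bit word into four signed 8-bit lanes."""
    out = []
    for i in range(4):
        b = (word >> (8 * i)) & 0xFF
        out.append(b - 256 if b >= 128 else b)
    return out


def sdot4(a: int, b: int, acc: int = 0) -> int:
    """Accumulate the dot product of the four lanes of a and b."""
    return acc + sum(x * y for x, y in zip(lanes8(a), lanes8(b)))


a = pack8([1, 2, 3, 4])
b = pack8([5, 6, 7, 8])
print(sdot4(a, b))  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

One such operation replaces four multiplies and four additions on scalar data, which is where the gain for near-sensor workloads on narrow data types comes from.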



- Floating Point Unit:
  - IEEE 754 single precision
    - Iterative DIV/SQRT (7 cycles)
    - Pipeline MAC, MUL, ADD, SUB, Cast
    - Single cycle load, store, min, max, cmp etc

#### Thanks to all external users for their bug reports!



### **Implementation Results**



#### **Quentin SoC layout**



### **Estimated Energy Efficiency**



Total Power @ 350 MHz, 0.65V = 8.5 mW



Compared with our previous design in 40nm technology:

- <u>1.4x better performance (350 MHz vs. 250 MHz)</u>
- <u>4x better energy efficiency (40 vs. 14 MOPS/mW)</u>



### **Execution on Energy-Efficient SCM**



Total Power @ 350 MHz, 0.65V = 5.5 mW



**<u>1.5x better energy efficiency</u>** than execution from SRAM
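The efficiency figures follow from the two power numbers. The sketch below recomputes them, assuming roughly one operation per cycle at 350 MHz (an assumption; the slides do not state the operation count per cycle). Variable names are illustrative.

```python
# Energy efficiency from total power at 350 MHz, 0.65 V.
# Assumption: about one operation per cycle, i.e. ~350 MOPS at 350 MHz.
FREQ_MHZ = 350.0
OPS_PER_CYCLE = 1.0   # assumption, not stated on the slides

POWER_SRAM_MW = 8.5   # execution from SRAM
POWER_SCM_MW = 5.5    # execution from latch-based SCM


def mops_per_mw(power_mw: float) -> float:
    """Throughput (MOPS) divided by total power (mW)."""
    return FREQ_MHZ * OPS_PER_CYCLE / power_mw


eff_sram = mops_per_mw(POWER_SRAM_MW)
eff_scm = mops_per_mw(POWER_SCM_MW)
gain = eff_scm / eff_sram   # equals POWER_SRAM_MW / POWER_SCM_MW

print(f"SRAM: {eff_sram:.1f} MOPS/mW")
print(f"SCM : {eff_scm:.1f} MOPS/mW")
print(f"Gain: {gain:.2f}x")
```

Under this assumption the SRAM case comes out near the 40 MOPS/mW quoted earlier, and the ratio 8.5 / 5.5 ≈ 1.5 matches the stated gain for execution from SCM.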

#### **GF – Global Presence in Semiconductor Manufacturing**



| Technology nodes | 14, 12, 7nm | 28, 22, 12nm | 90–22nm | 180–22nm | 180–40nm | 350–90nm |
|------------------|-------------|--------------|---------|----------|----------|----------|
| Capacity (wafers / month) | Up to 60k<br>(300mm) | Up to 80k<br>(300mm) | Up to 20k<br>(300mm) | Up to 85k<br>(300mm) | 68k (300mm)<br>93k (200mm) | 40k<br>(200mm) |

Up to 10M wafers / year (200mm equivalents)
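The yearly total can be cross-checked against the monthly capacities in the table. The sketch below assumes that a 300mm wafer counts as (300/200)² = 2.25 200mm equivalents (conversion by area) and that the "up to" figures are summed at their peak values; both are assumptions about how the total was derived.

```python
# Cross-check of "up to 10M wafers / year (200mm equivalents)".
# Assumption: 300mm wafers are converted to 200mm equivalents by area.
AREA_RATIO = (300 / 200) ** 2            # 2.25

monthly_300mm_k = [60, 80, 20, 85, 68]   # thousands of 300mm wafers per month
monthly_200mm_k = [93, 40]               # thousands of 200mm wafers per month

monthly_equiv_k = sum(monthly_300mm_k) * AREA_RATIO + sum(monthly_200mm_k)
yearly_equiv_m = monthly_equiv_k * 12 / 1000

print(f"{monthly_equiv_k:.2f}k 200mm-equivalent wafers / month")
print(f"{yearly_equiv_m:.2f}M 200mm-equivalent wafers / year")
```

Under these assumptions the sum is about 837k equivalent wafers per month, or roughly 10M per year, consistent with the quoted figure.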

### GF Has the Industry's Best Dual-Track Roadmap





### 22FDX<sup>®</sup> for IoT, Automotive and RF

#### Compelling

- FinFET-like performance for customers who still want to use planar devices
- Ultra-low voltage (0.4V)
- Ultra-low leakage (1pA/µm)
- MRAM integration for IoT and Auto-MCU
- Automotive grade 2 and 1, mmWave / Radar
- Benchmark RF performance (>400GHz), and PA integration

#### Relevant

- IoT ARM, RISC-V processors for NB-IoT and AI ML functions
- Automotive ADAS /Vision, Infotainment, Body Electronics MCU, Radar
- RF <6GHz: Connectivity (BLE, Wi-Fi, Zigbee), Cellular (3G, 4G LTE, 5G)
- RF mmWave >26GHz, 5G infrastructure, ADC/DAC integration

### **FDXcelerator**<sup>™</sup> Accelerating RISC-V Developers and Partners

Reduce time to market and facilitate FDX<sup>™</sup> SoC product design

- Enabling Universities driving RISC-V innovations
  - ETH-Zurich / University of Bologna PULP Architecture
  - Berkeley Labs, IIT Chennai
- SiFive Core IP targeting Data Center, ML, Automotive and Embedded applications
  - E31, E51 Configurable RISC-V Cores
- Reduced Energy Microsystems Drones, Robotics, Camera Applications
  - RISC-V IP hardware validated on 22FDX platforms
  - 32bit, 64bit Cores with Neural network IP solutions
- ANDES Cores for IoT, RF Connectivity applications
  - 32bit IP cores for 22FDX platform
  - Power-efficient, smaller-footprint designs



### Conclusion

- Challenges for next-generation IoT end-nodes:
  - Performance → 22nm FDX technology offers high performance (HP) at low VDD
  - PVT variations → body biasing
- We presented Quentin: a 22FDX implementation of the open-source PULPissimo architecture
  - Optimized IO subsystem for efficient data transfers
  - Processor optimized for energy-efficient near-sensor data analytics
  - Up to 500 MOPS @ 0.8V
  - 45 MOPS/mW @ 350 MOPS, 0.65V
- First step towards silicon-qualified, free and open-source RISC-V IPs on the GF 22FDX process.

