Oscar Castañeda Fernández
Last Name: Castañeda Fernández
First Name: Oscar
Organisational unit: 09695 - Studer, Christoph / Studer, Christoph
38 results
Publications 1 - 10 of 38
- VLSI Design of a 3-bit Constant-Modulus Precoder for Massive MU-MIMO
Item type: Conference Paper
IEEE International Symposium on Circuits and Systems (ISCAS). Proceedings, 27–30 May 2018, Florence, Italy
Castañeda Fernández, Oscar; Jacobsson, Sven; Durisi, Giuseppe; et al. (2018)
Fifth-generation (5G) cellular systems will build on massive multi-user (MU) multiple-input multiple-output (MIMO) technology to attain high spectral efficiency. However, having hundreds of antennas and radio-frequency (RF) chains at the base station (BS) entails prohibitively high hardware costs and power consumption. This paper proposes a novel nonlinear precoding algorithm for the massive MU-MIMO downlink in which each RF chain contains an 8-phase (3-bit) constant-modulus transmitter, enabling the use of low-cost and power-efficient analog hardware. We present a high-throughput VLSI architecture and show implementation results on a Xilinx Virtex-7 FPGA. Compared to a recently reported nonlinear precoder for BS designs that use two 1-bit digital-to-analog converters per RF chain, our design enables up to 3.75 dB transmit power reduction at no more than a 2.7x increase in FPGA resources.

- A 283 pJ/b 240 Mb/s Floating-Point Baseband Accelerator for Massive MU-MIMO in 22FDX
Item type: Conference Paper
ESSCIRC 2022 - IEEE 48th European Solid State Circuits Conference (ESSCIRC)
Castañeda Fernández, Oscar; Benini, Luca; Studer, Christoph (2022)
We present PULPO, a floating-point baseband-processing accelerator for massive multi-user multiple-input multiple-output (MU-MIMO) basestations (BSs). PULPO accelerates matrix-vector products, not only with a matrix but also with its Hermitian, as well as affine transforms and nonlinear projections used in iterative algorithms that outclass traditional linear methods in various applications. PULPO is tightly integrated with the data memory of a system-on-chip (SoC), facilitating data exchange and co-operation with 8 RISC-V cores. The fabricated accelerator achieves efficiency comparable to that of recently proposed fixed-point baseband processors, while eliminating the burdens associated with fixed-point design, thus simplifying massive MU-MIMO BS development.

- PPAC: A Versatile In-Memory Accelerator for Matrix-Vector-Product-Like Operations
Item type: Conference Paper
2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Castañeda Fernández, Oscar; Bobbett, Maria; Gallyas-Sanhueza, Alexandra; et al. (2019)
Processing in memory (PIM) moves computation into memories with the goal of improving throughput and energy-efficiency compared to traditional von Neumann-based architectures. Most existing PIM architectures are either general-purpose but only support atomistic operations, or are specialized to accelerate a single task. We propose the Parallel Processor in Associative Content-addressable memory (PPAC), a novel in-memory accelerator that supports a range of matrix-vector-product (MVP)-like operations that find use in traditional and emerging applications. PPAC is, for example, able to accelerate low-precision neural networks, exact/approximate hash lookups, cryptography, and forward error correction. The fully-digital nature of PPAC enables its implementation with standard-cell-based CMOS, which facilitates automated design and portability among technology nodes. To demonstrate the efficacy of PPAC, we provide post-layout implementation results in 28nm CMOS for different array sizes. A comparison with recent digital and mixed-signal PIM accelerators reveals that PPAC is competitive in terms of throughput and energy-efficiency, while accelerating a wide range of applications and simplifying development.

- VLSI Designs for Joint Channel Estimation and Data Detection in Large SIMO Wireless Systems
Item type: Journal Article
IEEE Transactions on Circuits and Systems I: Regular Papers
Castañeda Fernández, Oscar; Goldstein, Tom; Studer, Christoph (2018)
Channel estimation errors have a critical impact on the reliability of wireless communication systems. While virtually all existing wireless receivers separate channel estimation from data detection, it is well known that joint channel estimation and data detection (JED) significantly outperforms conventional methods at the cost of high computational complexity. In this paper, we propose a novel JED algorithm and corresponding VLSI designs for large single-input multiple-output (SIMO) wireless systems that use constant-modulus constellations. The proposed algorithm is referred to as PRojection Onto conveX hull (PrOX) and relies on biconvex relaxation (BCR), which enables us to efficiently compute an approximate solution of the maximum-likelihood JED problem. Since BCR solves a biconvex problem via alternating optimization, we provide a theoretical convergence analysis for PrOX. We design a scalable, high-throughput VLSI architecture that uses a linear array of processing elements to minimize hardware complexity. We develop corresponding field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) designs, and we demonstrate that PrOX significantly outperforms the only other existing JED design in terms of throughput, hardware-efficiency, and energy-efficiency.

- Jammer Mitigation via Beam-Slicing for Low-Resolution mmWave Massive MU-MIMO
Item type: Journal Article
IEEE Open Journal of Circuits and Systems
Marti, Gian; Castañeda Fernández, Oscar; Studer, Christoph (2021)
Millimeter-wave (mmWave) massive multi-user multiple-input multiple-output (MU-MIMO) promises unprecedented data rates for next-generation wireless systems. To be practically viable, mmWave massive MU-MIMO basestations (BSs) must rely on low-resolution data converters, which leaves them vulnerable to jammer interference. This paper proposes beam-slicing, a method that mitigates the impact of a permanently transmitting jammer during uplink transmission for BSs equipped with low-resolution analog-to-digital converters (ADCs). Beam-slicing is a localized analog spatial transform that focuses the jammer energy onto a few ADCs, so that the transmitted data can be recovered based on the outputs of the interference-free ADCs. We demonstrate the efficacy of beam-slicing in combination with two digital jammer-mitigating data detectors: SNIPS and CHOPS. Soft-Nulling of Interferers with Partitions in Space (SNIPS) combines beam-slicing with a soft-nulling data detector that exploits knowledge of the ADC contamination; projeCtion onto ortHOgonal complement with Partitions in Space (CHOPS) combines beam-slicing with a linear projection that removes all signal components co-linear to an estimate of the jammer channel. Our results show that beam-slicing enables SNIPS and CHOPS to successfully serve 65% of the user equipments (UEs) for scenarios in which their antenna-domain counterparts that lack beam-slicing are only able to serve 2% of the UEs.

- A Jammer-Mitigating 267 Mb/s 3.78mm² 583 mW 32 x 8 Multi-User MIMO Receiver in 22FDX
Item type: Other Conference Item
2024 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits)
Bucheli, Florian; Castañeda Fernández, Oscar; Marti, Gian; et al. (2024)
We present the first multi-user (MU) multiple-input multiple-output (MIMO) receiver ASIC that mitigates jamming attacks. The ASIC implements a recent nonlinear algorithm that performs joint jammer mitigation (via spatial filtering) and data detection (using a box prior on the data symbols). Our design supports 8 user equipments (UEs) and 32 basestation (BS) antennas, QPSK and 16-QAM with soft-outputs, and enables the mitigation of single-antenna barrage jammers and smart jammers. The fabricated 22 nm FD-SOI ASIC includes preprocessing, has a core area of 3.78 mm², achieves a throughput of 267 Mb/s while consuming 583 mW, and is the only existing design that enables reliable data detection under jamming attacks.

- FPGA design of low-complexity joint channel estimation and data detection for large SIMO wireless systems
Item type: Conference Paper
2017 IEEE International Symposium on Circuits and Systems (ISCAS)
Castañeda Fernández, Oscar; Goldstein, Tom; Studer, Christoph (2017)
Joint channel estimation and data detection (JED) enables near-optimal error-rate performance in realistic wireless communication systems that suffer from channel estimation errors. In this paper, we propose a new JED algorithm and a corresponding FPGA design for large single-input multiple-output (SIMO) wireless systems that use constant-modulus constellations. Our algorithm, referred to as PrOX (short for PRojection Onto conveX hull), relies on biconvex relaxation (BCR) in order to efficiently compute an approximate solution of the maximum-likelihood JED problem, which itself exhibits prohibitive complexity. PrOX is a simple and hardware-friendly algorithm that achieves near-optimal error-rate performance for a wide range of system configurations. To demonstrate the efficacy of PrOX, we develop a scalable VLSI architecture and present reference implementation results on a Xilinx Virtex-7 FPGA. Compared to a recently reported reference JED design, PrOX achieves 3x higher throughput, 20x better hardware-efficiency (in terms of throughput per look-up table), and 8x improved energy-efficiency.

- Data Detection in Large Multi-Antenna Wireless Systems via Approximate Semidefinite Relaxation
Item type: Journal Article
IEEE Transactions on Circuits and Systems I: Regular Papers
Castañeda Fernández, Oscar; Goldstein, Tom; Studer, Christoph (2016)
Practical data detectors for future wireless systems with hundreds of antennas at the base station must achieve high throughput and low error rate at low complexity. Since the complexity of maximum-likelihood (ML) data detection is prohibitive for such large wireless systems, approximate methods are necessary. In this paper, we propose a novel data detection algorithm referred to as Triangular Approximate SEmidefinite Relaxation (TASER), which is suitable for two application scenarios: i) coherent data detection in large multi-user multiple-input multiple-output (MU-MIMO) wireless systems and ii) joint channel estimation and data detection in large single-input multiple-output (SIMO) wireless systems. For both scenarios, we show that TASER achieves near-ML error-rate performance at low complexity by relaxing the associated ML-detection problems into a semidefinite program, which we solve approximately using a preconditioned forward-backward splitting procedure. Since the resulting problem is non-convex, we provide convergence guarantees for our algorithm. To demonstrate the efficacy of TASER in practice, we design a systolic architecture that enables our algorithm to achieve high throughput at low hardware complexity, and we develop reference field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) designs for various antenna configurations.

- Finite-Alphabet Wiener Filter Precoding for mmWave Massive MU-MIMO Systems
Item type: Conference Paper
2019 53rd Asilomar Conference on Signals, Systems, and Computers
Castañeda Fernández, Oscar; Jacobsson, Sven; Durisi, Giuseppe; et al. (2019)
Power consumption of multi-user (MU) precoding is a major concern in all-digital massive MU multiple-input multiple-output (MIMO) base-stations with hundreds of antenna elements operating at millimeter-wave (mmWave) frequencies. We propose to replace part of the linear Wiener filter (WF) precoding matrix by a finite-alphabet WF precoding (FAWP) matrix, which enables the use of low-precision hardware that consumes low power and area. To minimize the performance loss of our approach, we present methods that efficiently compute FAWP matrices that best mimic the WF precoder. Our results show that FAWP matrices approach infinite-precision error-rate and error-vector magnitude performance with only 3-bit precoding weights, even when operating in realistic mmWave channels. Hence, FAWP is a promising approach to substantially reduce power consumption and silicon area in all-digital mmWave massive MU-MIMO systems.

- A 354 Mb/s 0.37 mm² 151 mW 32-User 256-QAM Near-MAP Soft-Input Soft-Output Massive MU-MIMO Data Detector in 28nm CMOS
Item type: Journal Article
IEEE Solid-State Circuits Letters
Castañeda Fernández, Oscar; Studer, Christoph; Jeon, Charles (2019)
This letter presents a novel data detector application-specific integrated circuit (ASIC) for massive multiuser multiple-input multiple-output (MU-MIMO) wireless systems. The ASIC implements a modified version of the large-MIMO approximate message passing algorithm (LAMA), which achieves near-optimal error-rate performance (i) under realistic channel conditions and (ii) for systems with as many users as base-station (BS) antennas. The hardware architecture supports 32 users transmitting up to 256-QAM simultaneously and in the same frequency band, and provides soft-input soft-output capabilities for iterative detection and decoding. The fabricated 28nm CMOS ASIC occupies 0.37 mm², achieves a throughput of 354 Mb/s, consumes 151 mW, and improves the SNR by more than 11 dB compared to existing data detectors in systems with 32 BS antennas and 32 users for realistic wireless channels. In addition, the ASIC achieves 4x higher throughput per area than a recently proposed message-passing detector.