A Hilbert-based method for processing respiratory timeseries

In this technical note, we introduce a new method for estimating changes in respiratory volume per unit time (RVT) from respiratory bellows recordings. By using techniques from the electrophysiological literature, in particular the Hilbert transform, we show how we can better characterise breathing rhythms, with the goal of improving physiological noise correction in functional magnetic resonance imaging (fMRI). Specifically, our approach leads to a representation with higher time resolution and better captures atypical breathing events than current peak-based RVT estimators. Finally, we demonstrate that this leads to an increase in the amount of respiration-related variance removed from fMRI data when used as part of a typical preprocessing pipeline. Our implementation will be publicly available as part of the PhysIO toolbox, which is distributed as part of the open-source TAPAS package (translationalneuromodeling.org/tapas). Highlights We introduce a new estimator for respiratory volume per unit time from respiratory recordings. We demonstrate how this is able to accurately characterise atypical breathing events. This removes significantly more variance when used as a confound regressor for fMRI data. Our implementation will be included in PhysIO, released as part of TAPAS: translationalneuromodeling.org/tapas.


Introduction
There has been much recent interest in the impact of global artefacts on fMRI data [Schölvinck et al. 2010;Satterthwaite et al. 2013;Burgess et al. 2016;Ciric et al. 2017;Murphy and Fox 2017;Liu et al. 2018], with many studies finding a link between physiological processes-particularly breathing-and these global signals [Power et al. 2017a;b;Byrge and Kennedy 2018;. However, despite the fact that models for these physiological processes and their impact on fMRI are well established [Glover et al. 2000;Birn et al. 2006;2008;Murphy et al. 2013], much recent work has focused on improved data-driven methods for artefact removal Aquino et al. 2020]. This is likely for two reasons. Firstly, even though physiologically-derived confounds have the advantage that they have a much greater a priori validity if one is concerned about mistakenly removing neural signal, there is a cost: they require extra data to be collected, inspected, and analysed Power et al. 2020]. Secondly, as Power et al. [2020] recently demonstrated, these fMRI artefacts typically arise in the context of unusual breathing events-very deep breaths, apnoeas, etc.-that are by definition hard for algorithms designed with 'normal' tidal breathing in mind to properly detect and characterise.
What we introduce here is a method, inspired by work from the electrophysiological literature, that seeks to address both of these concerns in the context of preprocessing recordings from respiratory bellows. We introduce a new estimator for respiratory volume per unit time (RVT) that does not require peak detection-a process that often requires post hoc manual intervention-and demonstrate that this accurately captures atypical breathing events. Empirically, when used as a confound regressor, this also removes more variance from fMRI data compared to a peak-based RVT estimator. Finally, for a much more detailed discussion of both breathing and breathing-related issues pertaining to fMRI than space affords in this technical note, we refer the reader to the excellent overview in Power et al. [ibid.].

Respiratory models and fMRI denoising
In the context of removing global fMRI confounds, the aim of the most commonly utilised breathing-related metric, RVT [Birn et al. 2006], is to collapse simple measurements of breathing-typically from peripheral physiological recordings, such as respiratory bellows-down to a metric that will correlate over time with the pressures of blood gases (e.g. pCO 2 and pO 2 ) 1 . There are two components to this: an increase in breathing rate or an increase in breathing depth will tend to, 1. Introduction bioRxiv for example, decrease pCO 2 . As such, RVT is simply the product of respiratory volume (RV) and breathing rate.
The other component to these fMRI-based models is the mapping from RVT to the blood-oxygen-level-dependent (BOLD) contrast. Any changes in the pressures of blood gases impact on key constituents of the BOLD contrast, such as blood volume (via vasodilation) [Gauthier and Fan 2019]. To account for these physiological processes and the inherent delays in the circulatory system, a respiratory response function (RRF) [Birn et al. 2008] is used to map from RVT to the BOLD signal. This is directly analogous to the haemodynamic response function which describes the mapping from neural activity to BOLD, and similar physiological models exist for changes in heart rate ].

RVT estimators
As we describe in more detail later, current methods typically characterise the recordings from respiratory bellows via peak detection, but this has issues in terms of both robustness and temporal resolution. Here, we use a decomposition based on the Hilbert-transform to characterise the respiratory recordings; this is a technique that has been more widely used in the context of characterising complex oscillatory waveforms from electrophysiological data [Brookes et al. 2011;Hipp et al. 2012;Luckhoo et al. 2012;Engel et al. 2013;Voytek et al. 2013;Cole and Voytek 2017;Nguyen et al. 2019]. The crucial link is that the quantities that we need to define RVT, breathing depth and rate, are simply the amplitude and (instantaneous) frequency of the breathing-related oscillation in the bellows recordings.
By way of contrast, RVT is typically estimated via peak detection, as described by Birn et al. [2006]. Firstly, one detects points of maximum inhalation and exhalation (i.e. peaks and troughs in the respiratory waveform). Given these, respiratory volume is defined as the difference in belt positions between peaks and troughs, and breathing rate is the reciprocal of the time between successive peaks (with the relevant quantities linearly interpolated as necessary). However, peak detection is not robust to noisy recordings and fundamentally constrains the temporal resolution of the model to the observed breath durations. Taken together, these two issues are why RVT-based measures can often 'miss' deep breaths [Power et al. 2020].
As such, alternatives to RVT with better estimability have been proposed. For example,  define a simple alternative to peak-based methods by simply taking the standard deviation of the respiratory signal in a small window around the timepoint of interest. While this has some of the purported benefits of our method, namely that it 'does not rely on the accuracy of peak detection required for breath-to-breath computations', it does not directly capture the effects of changes in breathing rate. Similarly,  propose what is essentially a hybrid of our approach and the above, by calculating the standard deviation of the respiratory envelope in a small window. Again, this is a peak-free method, but does not permit a clean dissociation of breathing depth and rate.
While the above two measures are more robust, they lack the direct physiological interpretability with regard to blood gas pressures that RVT offers. Therefore, our aim is to estimate RVT itself, and all comparisons in this technical note are with a peak-based RVT estimator. We use the implementation in the PhysIO toolbox [Kasper et al. 2017], and this is publicly available as part of the TAPAS package (translationalneuromodeling.org/tapas).

Methodology 2.1 Properties of the Hilbert transform and analytic signal
The Hilbert transform is central to our method as it allows us to derive an alternative representation of the respiratory recordings. For a given real-valued signal, ( ), we can define a unique analytic signal 2 , ( ), in terms of the Hilbert transform ℋ [Gabor 1946]: Where the second line is simply a rearrangement of the first into polar coordinates. Much more detail on the properties of this approach can be found in, for example, Boashash [1992a] and Huang et al. [2009]. The important aspect of this representation for the current work is the way we have split our signal into two components: an instantaneous magnitude, ( ), and an instantaneous phase, ( ). We can do one final rearrangement to make this more explicit: We plot this decomposition of a signal recorded from respiratory bellows in Figure 1. Note that we use the term amplitude envelope interchangeably with magnitude, as this is the common description in the electrophysiological literature.
Finally, these two terms are exactly what we need to calculate RVT. The respiratory volume (i.e. the difference between successive peaks and troughs) is simply twice the signal amplitude, ( ). Similarly, the breathing rate is the instantaneous frequency, ( ), which is the temporal derivative of the instantaneous phase: As a last step, we apply a low pass filter to these quantities to remove within-cycle fluctuations. The breathing rhythm is not sinusoidal, so the above terms will contain some high-frequency content that represents within-breath modulations of the waveform shape. Here, as in a typical preprocessing pipeline, RVT is convolved with the RRF-which is itself an aggressive low-pass filter-so this is not strictly necessary for fMRI denoising, but we do so because it affords more intuitive visualisations of the respiratory decomposition 3 .

Practical considerations and algorithmic approach
While the Hilbert transform is theoretically well suited to the problem at hand, deriving the necessary quantities can present a formidable practical challenge [Boashash 1992b;Huang et al. 1998]. Noise-or, more specifically, multiple local extrema that are not part of zero-centred oscillations-can cause the instantaneous frequency to become negative [Huang et al. 2009]. Correcting for these effects is a core part of the empirical mode decomposition (EMD) [Huang et al. 1998], whereby a single signal is decomposed into multiple modes, each of which is a monocomponent signal that admits a 'well behaved' decomposition into amplitude and phase terms. However, this is a data-driven approach that does not guarantee that the main breathing rhythm will be represented by a single mode. As such, while we could take an EMD-based approach here, we do something simpler. Rather than decomposing the signal and then selecting the breathingrelated modes post hoc, we assume that we can make an approximately monocomponent signal by appropriately filtering the breathing signal. In practice, this requires a few rounds of iterative refinement, and the overall algorithm is detailed below. To aid the interpretation of the filtering operations below, note that the 'normal' breathing rate in adults-from both subjects resting in a supine position and undergoing MRI scanning-is approximately 0.25 Hz [Tobin et al. 1983;].
Our proposed algorithm for RVT estimation based on the Hilbert-transform consists of the following general steps: 1 | PhysIO preprocessing: Remove low frequency drifts (less than 0.01 Hz) from the breathing signal, and remove high-frequency noise above 2.0 Hz. This version of the breathing signal is used for both RETRO-ICOR and the peak-based RVT estimate in the physiological pipelines used in the Results section. This is the raw trace shown in Panel (a) of Figure 3.
2 | Lowpass filter the data again to more aggressively remove high-frequency noise above 0.75 Hz. This is the raw trace shown in Figure 1.

| Decompose the signal into magnitude and phase components via the Hilbert transform.
4 | Linearly interpolate any periods where the phase timecourse decreases, using the procedure in Figure 2, to remove any artefactual negative frequencies.
Reconstruct the oscillatory portion of the signal, cos( ( )), and lowpass filter at 0.75 Hz to remove any resulting artefacts. This procedure is repeated 10 times, with the new phase timecourse reestimated from the filtered oscillatory signal.
5 | Calculate RV and low-pass filter the signal magnitude at 0.2 Hz to remove within-cycle fluctuations. Instantaneous breathing frequency is the numerical derivative of the phase timecourse, filtered as above.
Finally, estimates are thresholded to remove physiologically implausible values.
All filters, except for the common preprocessing, are 10 th -order infinite impulse response, with the cut-offs stated in terms of the half-power frequency. Filters are run forwards and backwards to ensure zero-phase responses, and use 10 s of circular padding (i.e. designfilt, padarray, and filtfilt in MATLAB). The filters used during the preprocessing are 20 th -order and use 100 s of padding due to the low cut-off frequency of the filter used to remove drifts in the signal.

Data overview and preprocessing pipeline
Here, we demonstrate the performance of our method on data from a pharmacological fMRI study [Iglesias et al. 2020]. Briefly, we analyse data from 69 subjects who were scanned while performing an audio-visual associative learning task after receiving a minimal dose of either a dopaminergic antagonist (amisulpride), a cholinergic antagonist (biperiden), or placebo. Full details of the fMRI acquisition parameters, preprocessing pipeline, and task analyses can be found in the Supplementary Material.
We use this dataset to compare the performance of our RVT estimator to the peak-based approach implemented in PhysIO. As well as a set of qualitative comparisons, we run a group-level paired t-test to formally assess which method results in the stronger removal of respiratory-related variance.

Qualitative behaviour
In Figure 3 we compare the quantities inferred from the raw respiratory belt recordings. We focus on a single deep breath (i.e. a sigh [Li and Yackle 2017]), as this is the type of breath that Power et al. [2020] noted that RVT can miss.
In Panel (a) we show the detected maxima and minima, which amounts to the complete description of the data for peak-based methods (the equivalent Hilbert-based decomposition is shown in Figure 1). As detailed in Kasper et al. [2017], PhysIO uses a sophisticated peak detection algorithm that includes some assumptions about the regularity with which successive peaks appear in time. However, in the case of this particular complex waveform caused by the sigh, this results in local maxima and minima being incorrectly identified. This could be corrected post hoc-though this can be a time intensive procedure-or the assumption of regularity could be relaxed. However, as discussed in Power et al. [2020], there is unlikely to be an optimal setting for peak detection algorithms. In this instance, one would manually mark a maximum at 528 s, a trough at 532 s, and the next maximum would be at 543 s; however, this is an incredibly sparse representation of 15 s of data and one which would conflate the increase in respiratory volume with the decrease in respiratory rate. Counter-intuitively therefore, the RVT method used here is more similar to an RV measure (i.e. the changes in breathing rate are small). Finally, note that while empirical results suggest that avoiding rate and depth changes cancelling in this manner may well improve denoising by reducing the number of 'missed' deep breaths, one would rather have an RVT estimator that was sensitive to these two effects separately [ibid.].
Finally, in Panels (b), (c), and (d) we compare the inferred respiratory volumes, breathing rates, and RVT itself. What is clear is that our method correctly identifies the changes in volume and rate, and furthermore, because we have a measure that is well defined for all points in time, these are not temporally overlapping. This leads to the expected changes in RVT, namely a rise driven by an increase in breathing volume, followed by a drop to well below baseline driven by the reduced breathing rate.
The Supplementary Material contains one extra visualisation of the data from this subject: in Supplementary Figure 2 we compare the inferred RVT regressors Figure 3: Comparison between the predictions from our Hilbert-based decomposition and a method based on peak detection. The portion of the respiratory trace is the same as in the lower panel of Figure 1, and includes a large inhalation at 530 s, followed by an apnoea lasting approximately 15 s. In (a) we show the detected peaks, which are the complete set of summary statistics used to define RVT as per Birn et al. [2006]. In (b), (c), and (d) we compare the inferred respiratory volume, breath durations, and RVT respectively.

Discussion
bioRxiv with the greyplot of the fMRI data itself.

Quantitative performance
In Figure 4 we compare the main effect of the RVT-based regressors at the group level. As detailed in the Supplementary Material, we run two separate grouplevel analyses to visualise the mean effect of RVT for the two different estimators, followed by a paired t-test analysing the differences between the methods. While both approaches show the expected pattern of strong positive effects distributed throughout the grey matter [Birn et al. 2006;2008;Power et al. 2017b], the bottom panel demonstrates how our method removes significantly more respiratory-related variance from the fMRI data.

Discussion
In summary, we have demonstrated that the quantities we can derive via the Hilbert transform-instantaneous estimates of amplitude and phase-naturally map onto the quantities we need to estimate RVT. This furnishes us with a simple method that allows us to calculate a time-resolved version of RVT, rather than one defined in terms of a sparse set of peaks and troughs. We have demonstrated that this can better capture atypical breathing events, even in the context of non-sinusoidal oscillations with large and rapid changes in amplitude and phase. Finally, when convolved with the RRF as part of a typical fMRI preprocessing pipeline, our RVT estimates remove more respiratory-related variance from fMRI data than our baseline peak-detection-based method.
One of the aspects of our method we have emphasised throughout this technical note is the way it can dissociate changes in the depth and rate of breathing. However, the summary metric typically used for denoising, RVT, is the product of these two quantities. As Power et al. [2020] note, this introduces an ambiguity as 'a larger-than-normal breath transpiring over a longer-than-normal time may appear quantitatively just like a typical breath occurring over a typical time', or, in other words, RVT 'is by definition relatively insensitive to outlier breath volumes so long as they [inversely] scale with breath times'. Our results speak to this in two ways. Firstly, what our results hint at is that this is primarily an artefact of the low time-resolution implied by reducing a signal down to its peaks alone, and not RVT itself. Figure 3 clearly shows how the changes in rate and depth we infer are not temporally coincident around a large and slow breath, and therefore we still see large changes in RVT in this instance. Secondly, however, if there are distinct differences in physiological responses to depth and rate changes [Birn et al. 2008;Power et al. 2020] then our method provides both time series separately. This would allow one to derive fMRI regressors for both effects, which may improve the denoising performance for respiratory artefacts.
Finally, in so far as our approach implies a model for the data, it is embodied by Equation 2: ( ) = ( ) cos( ( )). There are two key assumptions here. Firstly  Figure 4: Comparison of the fMRI-RVT associations for our Hilbert-based RVT estimator and a method based on peak detection. In the upper two panels we show the t-stats for the main effect of the RVT regressors across the group (peak-level FWE-corrected at = 0.05). In the lower panel we show the significant differences between the two as estimated via a group-level paired t-test (cluster-level FWEcorrected at = 0.05 with cluster-forming threshold = 0.001).
that the Hilbert transform provides a physically meaningful separation of fluctuations in magnitude and phase. The Hilbert transform is convenient, but splitting one time-varying signal into two time-varying quantities is an underdetermined problem that admits different solutions [Boashash 1992a]. Similarly, the interactions between the two quantities can themselves carry meaningful information [Huang et al. 2016;Nguyen et al. 2019]. Therefore, it may be that other decompositions or regularisation strategies improve performance. Secondly, the implicit assumption is that we can remove the vast majority of noise by appropriately filtering ( ). Again, this is an oversimplification in the presence of structured artefacts caused by, for example, contractions of the abdominal musculature unrelated to breathing or belt slippages [Power et al. 2020]. A more formal model might explain the data better here.
On the other hand, the simplicity of our model allows one to apply it robustly to a wide range of data, which may render it a useful quality control measure in and of itself. For example, given the magnitude and phase timecourses one could reconstruct the signal as modelled, and then compute a time-resolved measure of goodness-of-fit to the original recording. This may prove useful if it could, for example, be used to flag 'bad' timepoints that warrant manual inspection.

Conclusion
In this technical note we have introduced a new method for estimating RVT from respiratory recordings for the purpose of removing physiological artefacts from fMRI data. We have demonstrated how this can both improve the detection of atypical breathing events and more strongly attenuate global fMRI artefacts as compared to current techniques.