Self-Gated Respiratory Motion Rejection for Optoacoustic Tomography

: Respiratory motion in living organisms is known to result in image blurring and loss of resolution, chieﬂy due to the lengthy acquisition times of the corresponding image acquisition methods. Optoacoustic tomography can e ﬀ ectively eliminate in vivo motion artifacts due to its inherent capacity for collecting image data from the entire imaged region following a single nanoseconds-duration laser pulse. However, multi-frame image analysis is often essential in applications relying on spectroscopic data acquisition or for scanning-based systems. Thereby, e ﬃ cient methods to correct for image distortions due to motion are imperative. Herein, we demonstrate that e ﬃ cient motion rejection in optoacoustic tomography can readily be accomplished by frame clustering during image acquisition, thus averting excessive data acquisition and post-processing. The algorithm’s e ﬃ ciency for two-and three-dimensional imaging was validated with experimental whole-body mouse data acquired by spiral volumetric optoacoustic tomography (SVOT) and full-ring cross-sectional imaging scanners.


Introduction
Motion during signal acquisition is known to result in image blurring and can further hinder proper registration of images acquired by different modalities [1][2][3][4].Respiratory motion compensation in tomographic imaging methods is often based on a gated acquisition assisted by physiological triggers, e.g., an electrocardiogram (ECG) signal.In prospective gating, the data is acquired during a limited time window when minimal, or no motion occurs.Alternately, retrospective gating correlates between the acquired images and physiological triggers during post-processing [5].More advanced retrospective approaches are based on self-gated methods where the physiological trigger is extracted from the image data itself [6][7][8].An alternative solution consists in motion tracking of specific points and subsequent correction with rigid-body transformations [9].In some parts of the body, such as the thoracic region, non-rigid motion is further produced.Thus, more sophisticated models are generally required to estimate and correct for the effects of respiratory motion [10].
High-frame-rate imaging modalities can avoid motion if sub-pixel displacements are produced during the effective image integration time.Particularly, optoacoustic tomography (OAT) can render 2D and 3D images via excitation of an entire volume with a single laser pulse [11].This corresponds to an effective integration time in the order of the pulse duration, typically a few nanoseconds.This way, the tissue motion can be "frozen" much more efficiently than most other imaging modalities.OAT has found applicability in biological studies demanding high-frame-rate imaging, such as characterization of cardiac dynamics [12], mapping of neuronal activity [13], monitoring hemodynamic patterns in tumors [14] or visualization of freely-behaving animals [15].Moreover, real-time imaging has been paramount in the successful translation of OAT to render motion-free images acquired in a handheld mode [16].While motion correction in OAT might not be relevant for images rendered with a single laser pulse, acquisition of multiple frames is still required in many applications, e.g., for rendering volumetric data from multiple cross-sections or for extending the effective field of view (FOV) of a given imaging system [17].Multiple frames are also required for multi-spectral optoacoustic tomography (MSOT) applications, where mapping of intrinsic tissue chromophores or extrinsically administered agents is achieved via spectral or temporal unmixing [18][19][20][21].Cardiac and breathing motion could readily be captured by OAT systems running at frame rates of tens of hertz [22,23], and several approaches have been suggested to mitigate motion artefacts in applications involving multi-frame data analysis.For instance, respiratory motion gating was suggested by simultaneously capturing the animal's respiratory waveforms [24].Motion correction was alternatively performed with 3D rigid-body transformations [25] and with free-form deformation models [26].Models of body motion have also been suggested for other types of scanning-based systems [27,28].Additionally, motion suppression could be achieved by reducing the delay between consecutive pulsed light excitations [29,30], which requires dedicated laser systems.
In this work, we demonstrate that motion rejection in OAT can effectively be performed on-the-fly, before image reconstruction.The suggested approach consists in clustering a sequence of OAT frames that employs the raw time-resolved signals without involving computationally and memory extensive post-processing.This represents an important advantage over other known approaches operating in the image domain [31].

Pre-Reconstruction Motion Rejection Approach
The algorithm suggested in this work aims at motion rejection in OAT systems based on a multi-frame acquisition of time-resolved pressure signals with transducer arrays.Figure 1a schematically depicts two examples of transducer array configurations for 2D and 3D imaging, which are described in more detail in the following sections.The acquired signals are generally arranged into so-called sinograms, where every sinogram represents a single frame (Figure 1b).At a fixed transducer position, k frames (sinograms) are acquired.These frames consist of matrices with rows representing the m time-samples of each signal and columns corresponding to the n transducer elements (channels) of the array.Step 1 of the algorithm consists in rearranging the k frames of the sequence into columns of a 2D matrix containing m x n rows and k columns, which represent the entire sequence of frames at a fixed transducer position (Figure 1c).In the experiments performed, the number of frames acquired at each array position were chosen to adequately capture a complete breathing cycle.Step 2 of the algorithm consists in calculating the autocorrelation matrix of all pairs of frames (MATLAB (Mathworks Inc, Natick, USA) function 'corrcoef').An example of the calculated correlation coefficients is displayed in Figure 1d.At a fixed transducer position, time decorrelation is expected to be chiefly affected by respiratory motion.Clustering of frames is subsequently done in Step 3 by applying the second order k-means method to the correlation coefficients matrix (Figure 1e, MATLAB function 'kmeans').In Step 4, the k frames are then divided into two sets based on predetermined knowledge of the characteristic physiology of the animal under specific anesthesia.As a rule, motion frames are typically fewer than static frames.Notably, when scanning at multiple transducer positions, Steps 1-4 are to be repeated for each transducer position.As an example, Figure 1f displays a comparison of the 3D views of a reconstructed image from a single position of the spherical array, as obtained from the averaged selected-frames and from the averaged rejected-frames.

Spiral Volumetric Optoacoustic Tomography
The spiral volumetric optoacoustic tomography (SVOT) scanner is schematically depicted in Figure 1a (top).A detailed description of the system is available elsewhere [32].Briefly, a spherical ultrasound array of piezocomposite elements (Imasonics SaS, Voray, France) is mounted on motorized rotating and translating stages and scanned around the animal following a spiral trajectory.The array consists of 256 elements with a central frequency of 4 MHz and −6 dB bandwidth of ~100%, arranged in a hemispherical surface with angular coverage of 90°.The excitation light beam is guided via a fiber bundle (CeramOptec GmbH, Bonn, Germany) through a cylindrical aperture in the center of the sphere.SVOT enables imaging of the entire mouse with a nearly isotropic 3D spatial resolution in the 200 μm range [31].In the experiments, light excitation was provided with a short-pulsed laser (<10 ns duration pulses with 25 mJ per-pulse energy and up to 100 Hz pulse repetition frequency) based on an optical parametric oscillator (OPO) crystal (Innolas GmbH, Krailling, Germany).The pulse repetition frequency of the laser was set to 25 Hz and the wavelength was maintained at 800 nm, corresponding to the isosbestic point of hemoglobin.The array was scanned for 17 angular positions separated by 15° (total angular coverage in the azimuthal direction of 240°) and for 30 vertical positions separated by 2 mm (total scanning length of 58 mm, a full-body scan requires approximately 10 minutes).50 frames were captured for each position of the array, for which all signals were simultaneously digitized at 40 megasamples per second with a custom-made data acquisition system (DAQ, Falkenstein Mikrosysteme GmbH, Taufkirchen, Germany) triggered with the Q-switch output of the laser.The acquired data was eventually transmitted to a PC via Ethernet.

Spiral Volumetric Optoacoustic Tomography
The spiral volumetric optoacoustic tomography (SVOT) scanner is schematically depicted in Figure 1a (top).A detailed description of the system is available elsewhere [32].Briefly, a spherical ultrasound array of piezocomposite elements (Imasonics SaS, Voray, France) is mounted on motorized rotating and translating stages and scanned around the animal following a spiral trajectory.The array consists of 256 elements with a central frequency of 4 MHz and −6 dB bandwidth of ~100%, arranged in a hemispherical surface with angular coverage of 90 • .The excitation light beam is guided via a fiber bundle (CeramOptec GmbH, Bonn, Germany) through a cylindrical aperture in the center of the sphere.SVOT enables imaging of the entire mouse with a nearly isotropic 3D spatial resolution in the 200 µm range [31].In the experiments, light excitation was provided with a short-pulsed laser (<10 ns duration pulses with 25 mJ per-pulse energy and up to 100 Hz pulse repetition frequency) based on an optical parametric oscillator (OPO) crystal (Innolas GmbH, Krailling, Germany).The pulse repetition frequency of the laser was set to 25 Hz and the wavelength was maintained at 800 nm, corresponding to the isosbestic point of hemoglobin.The array was scanned for 17 angular positions separated by 15 • (total angular coverage in the azimuthal direction of 240 • ) and for 30 vertical positions separated by 2 mm (total scanning length of 58 mm, a full-body scan requires approximately 10 min).50 frames were captured for each position of the array, for which all signals were simultaneously digitized at 40 megasamples per second with a custom-made data acquisition system (DAQ, Falkenstein Mikrosysteme GmbH, Taufkirchen, Germany) triggered with the Q-switch output of the laser.The acquired data was eventually transmitted to a PC via Ethernet.

Cross-Sectional Optoacoustic Tomography with a Ring Array
The system layout is depicted in Figure 1a (bottom) while its detailed description is available in [33].Briefly, the ultrasound array (Imasonics SaS, Voray, France) consists of an 80 mm diameter ring having 512 ultrasound individual detection elements with 5 MHz central frequency and −6 dB bandwidth of ~80%.Each element is cylindrically focused at a distance of 38 mm to selectively capture signals from the imaged cross-section.In the experiments, light excitation was provided by a short-pulsed (<10 ns duration pulses at a wavelength of 1064 nm with ~100 mJ per-pulse energy and 15 Hz pulse repetition frequency) Nd:YAG laser (Spectra Physics, Santa Clara, California).The laser beam was guided with a fiber bundle (CeramOptec GmbH, Bonn, Germany) having 12 output arms placed around the circumference of the ring transducer with an angular separation of 60 • between the arms.Much like the SVOT system, signals detected by all the array elements were simultaneously digitized at 40 megasamples per second using custom-made DAQ (Falkenstein Mikrosysteme GmbH, Taufkirchen, Germany), triggered with the Q-switch output of the laser.The data was transmitted to a PC via Ethernet.In total, 100 frames were recorded with the array positioned at two distinct regions of the mouse.

Image Reconstruction and Processing
In both scanning systems, the acquired signals were band-pass filtered (cut-off frequencies 0.25-6 MHz for SVOT and 0.5-8 MHz for cross-sectional OAT) and deconvolved with the impulse response of the array elements before reconstruction.For SVOT, tomographic reconstructions of single volumes (15 × 15 × 15 mm 3 ) for each scanning position of the spherical array transducer were done using a 3D back-projection-based algorithm [34,35].Volumetric images reconstructed at every transducer position were stitched together to render images from a larger field of view (whole-body scale).For cross-sectional OAT, the same back-projection algorithm was modified to account for the heterogeneous distribution of the speed of sound in the mouse versus the coupling medium (water) [36].For this, an initial image was first reconstructed by considering a uniform speed of sound corresponding to the speed of sound in water (determined from the measured water temperature).The animal's surface was then manually segmented, and the reconstruction was fine-tuned by assigning a different speed of sound to the segmented tissue volume in order to optimize image quality.The processing was executed with a self-developed MATLAB code.Universal image quality index (QI) was calculated for the resulting images.QI is an objective image quality index that combines three models: loss of correlation, luminance distortion and contrast distortion-a detailed description and efficient MATLAB implementation was reported in [37].

Mouse Experiments
All in-vivo animal experiments were performed in full compliance with the institutional guidelines of the Helmholtz Center Munich and with approval by the Government District of Upper Bavaria.Hairless NOD.SCID mice (Envigo, Rossdorf, Germany) were anesthetized with isoflurane.For both imaging systems, a custom-made holder was used to vertically fix the mice in a stationary position with fore and hind paws attached to the holder during the experiments.The mice were immersed inside the water tank with the animal head being kept above water.The temperature of the water tank was maintained at 34 • C with a feedback controlled heating stick.A breathing mask with a mouth clamp was used to fix the head in an upright position and to supply anesthesia and oxygen.During measurement, the anesthesia level was kept at ~2% isoflurane.

Spiral Volumetric Optoacoustic Tomography
A whole-body (neck to hind paws) SVOT image, reconstructed using the frames selected with the proposed motion rejection approach, is displayed in Figure 2a.In these experiments, an average of 32% of frames were rejected per transducer position.The effectiveness of the motion rejection approach is demonstrated by analyzing three specific regions (dashed squares in Figure 2a).Specifically, we compare the image combining all the frames, with the one obtained by averaging the selected (static) frames as well as the image obtained by averaging the rejected (motion) frames.Note that the red square partially captures the thoracic region.It can be observed that a small vessel, clearly visible in the image rendered from the selected frames (Figure 2b, red arrow), cannot be resolved in the image reconstructed using all the frames.Furthermore, the former image features a clear motion artifact in the form of a 'double vessel' (red arrow), thus concealing the small vessel.The green square captures the region around the liver.A vertical vessel appears regular and complete in the selected-frames image (red arrow), whereas the same vessel appears disrupted in the all-frames image.Also here, the rejected-frames image discloses an artifact responsible for distorting the all-frames image.Finally, the blue square captures part of the abdomen.Clearly, small vessels are better resolved in the selected-frames image (red arrows).Notably, different structures appear to be blurred (yellow arrows) in the rejected-frames images with respect to the selected-frames images.A comparison between amplitude profiles of structures labeled by dashed yellow lines in Figure 2b further emphasizes the effectiveness of the motion rejection algorithm (Figure 2c) with the signal amplitude typically improved by 10% to 30% in the selected-frames images.Likewise, fine details appear more prominent in the selected-frames images, which is evinced by additional, fine peaks in the amplitudes profiles.that the red square partially captures the thoracic region.It can be observed that a small vessel, clearly visible in the image rendered from the selected frames (Figure 2b, red arrow), cannot be resolved in the image reconstructed using all the frames.Furthermore, the former image features a clear motion artifact in the form of a 'double vessel' (red arrow), thus concealing the small vessel.The green square captures the region around the liver.A vertical vessel appears regular and complete in the selectedframes image (red arrow), whereas the same vessel appears disrupted in the all-frames image.Also here, the rejected-frames image discloses an artifact responsible for distorting the all-frames image.
Finally, the blue square captures part of the abdomen.Clearly, small vessels are better resolved in the selected-frames image (red arrows).Notably, different structures appear to be blurred (yellow arrows) in the rejected-frames images with respect to the selected-frames images.A comparison between amplitude profiles of structures labeled by dashed yellow lines in Figure 2b further emphasizes the effectiveness of the motion rejection algorithm (Figure 2c) with the signal amplitude typically improved by 10% to 30% in the selected-frames images.Likewise, fine details appear more prominent in the selected-frames images, which is evinced by additional, fine peaks in the amplitudes profiles.

Cross-Sectional Optoacoustic Tomography
Effectiveness of the algorithm in cross-sectional OAT was tested by comparing the selected-and rejected-frames images taken from two distinct regions of the animal (Figure 3a).Between 20% and 31% of the frames were rejected in the top and bottom cross-sections, respectively.The rejected- (C) Amplitude profiles marked in b (yellow dashed lines) for images reconstructed from all the frames (dashed lines) versus selected frames (solid lines).

Cross-Sectional Optoacoustic Tomography
Effectiveness of the algorithm in cross-sectional OAT was tested by comparing the selected-and rejected-frames images taken from two distinct regions of the animal (Figure 3a).Between 20% and 31% of the frames were rejected in the top and bottom cross-sections, respectively.The rejected-frames images reveal smearing artifacts caused by a breathing motion that are evident across the entire mouse cross-section.Fine structures (red arrows) within the abdominal space appear blurred in the rejected-frames image.Moreover, some superficial structures seem to be artificially 'doubled' (yellow arrows) in the rejected-frames images.Minor differences were observed in the all-frames images (data not shown) with respect to the selected-frames images.Likewise, amplitude profiles from selected structures (dashed yellow line in Figure 3a) are increased by ~10% in the selected-frames images (Figure 3b).The calculated QI clearly reveals distortions at the boundaries of major structures, located mostly superficially (Figure 3c).from selected structures (dashed yellow line in Figure 3a) are increased by ~10% in the selectedframes images (Figure 3b).The calculated QI clearly reveals distortions at the boundaries of major structures, located mostly superficially (Figure 3c).

Discussion
The presented results demonstrate that motion rejection in OAT can effectively be accomplished prior to image reconstruction.This represents a significant advantage with respect to previously reported motion rejection approaches based on auto-correlation of a sequence of reconstructed images [31], which are afflicted with excessive memory and post-processing requirements.The suggested method was successfully validated with data acquired by two-and three-dimensional imaging systems.However, motion rejection was more effective in the case of volumetric SVOT scans.In particular, it benefited from both amplitude increase of 10% to 30% and improvement in the visibility of fine details, whereas images from the cross-sectional imaging system yielded a lower amplitude increase (~10%) and minor improvement in the visibility of structures.The reason behind the reduced performance of motion rejection in cross-sectional imaging may be ascribed to the fact that breathing-associated movements are not limited to a single plane, while in-plane motion is mainly detected in the signals.Yet, although the differences between selected-and all-frames crosssectional images were minor, it was possible to quantify them by utilizing a QI based distortion measures.Notably, such distortion artifacts affect almost exclusively the edges of large structures.In spite of the fact that standard frame averaging in cross-sectional imaging may yield qualitatively comparable results, reliable rejection of 20% to 31% motion-affected frames by the algorithm may turn crucial for quantitative analyses of high resolution data, e.g.involving spectral unmixing of fine structures.
It is also important to take into account that breathing characteristics may differ from one animal to another due to age, health, size, sex or strain.All these factors affect the resilience of the animal to

Discussion
The presented results demonstrate that motion rejection in OAT can effectively be accomplished prior to image reconstruction.This represents a significant advantage with respect to previously reported motion rejection approaches based on auto-correlation of a sequence of reconstructed images [31], which are afflicted with excessive memory and post-processing requirements.The suggested method was successfully validated with data acquired by two-and three-dimensional imaging systems.However, motion rejection was more effective in the case of volumetric SVOT scans.In particular, it benefited from both amplitude increase of 10% to 30% and improvement in the visibility of fine details, whereas images from the cross-sectional imaging system yielded a lower amplitude increase (~10%) and minor improvement in the visibility of structures.The reason behind the reduced performance of motion rejection in cross-sectional imaging may be ascribed to the fact that breathing-associated movements are not limited to a single plane, while in-plane motion is mainly detected in the signals.Yet, although the differences between selected-and all-frames cross-sectional images were minor, it was possible to quantify them by utilizing a QI based distortion measures.Notably, such distortion artifacts affect almost exclusively the edges of large structures.In spite of the fact that standard frame averaging in cross-sectional imaging may yield qualitatively comparable results, reliable rejection of 20% to 31% motion-affected frames by the algorithm may turn crucial for quantitative analyses of high resolution data, e.g., involving spectral unmixing of fine structures.
It is also important to take into account that breathing characteristics may differ from one animal to another due to age, health, size, sex or strain.All these factors affect the resilience of the animal to the experimental setup, the feasible depth of anesthesia and the overall duration of the experiment [38].It was previously reported that mice under 2% isoflurane anesthesia have an average respiratory rate of 44 ± 9 breaths/min [39], where the breathing rhythm is characterized by pauses between breaths longer than the breaths themselves.As a result, the majority of the frames are static, i.e., not affected by motion.Herein, we relied on such prior knowledge of the characteristic respiratory rate and breathing rhythm to establish a rejection criterion for the clustered motion (rejected) frames.Likewise, other criteria independent of these factors may alternatively be implemented.
In conclusion, the developed motion rejection methodology can benefit numerous optoacoustic imaging methods relying on multi-frame image analysis, such as scanning-based tomography or spectroscopic imaging systems like the MSOT.It may also find applicability in handheld clinical imaging [40,41], where motion can hinder accurate signal quantification and interpretation of longitudinal and spectroscopic data.

8 Figure 1 .
Figure 1.A schematic diagram of the steps involved in the motion rejection algorithm.(a) Two-and three-dimensional scanning systems, (top) spiral volumetric optoacoustic tomography (SVOT) based on a spherical array of transducers and (bottom) cross-sectional optoacoustic tomography based on a full-ring array of cylindrically focused transducers.(b) Sequence of frames (sinograms) acquired at a single position of the scanner.(c) Rearrangement of the data corresponding to the entire sequence into a single matrix.(d) Correlation coefficients of the autocorrelation matrix of the columns in c.(e) K-means clustering of the correlation coefficients matrix into two groups, namely, selected (static) frames and rejected (motion) frames.(f) Volumetric image of a blood vessel reconstructed with data from the selected versus the rejected-frames.

Figure 1 .
Figure 1.A schematic diagram of the steps involved in the motion rejection algorithm.(A) Two-and three-dimensional scanning systems, (top) spiral volumetric optoacoustic tomography (SVOT) based on a spherical array of transducers and (bottom) cross-sectional optoacoustic tomography based on a full-ring array of cylindrically focused transducers.(B) Sequence of frames (sinograms) acquired at a single position of the scanner.(C) Rearrangement of the data corresponding to the entire sequence into a single matrix.(D) Correlation coefficients of the autocorrelation matrix of the columns in (C).(E) K-means clustering of the correlation coefficients matrix into two groups, namely, selected (static) frames and rejected (motion) frames.(F) Volumetric image of a blood vessel reconstructed with data from the selected versus the rejected-frames.

Figure 2 .
Figure 2. Motion rejection results for spiral volumetric optoacoustic tomography (SVOT).(a) Sagittal maximal intensity projection (MIP) of a volumetric image of the mouse reconstructed with the selected-frames (scale bar -1 cm).(b) Zoom-in of three regions marked in red, green, blue, respectively, in a.Each image is reconstructed with (left) all-frames, (center) the selected (static) frames (right) the rejected (motion) frames (scale bar -1 mm).Structural differences are marked (yellow and red arrows).(c) Amplitude profiles marked in b (yellow dashed lines) for images reconstructed from all the frames (dashed lines) versus selected frames (solid lines).

Figure 2 .
Figure 2. Motion rejection results for spiral volumetric optoacoustic tomography (SVOT).(A) Sagittal maximal intensity projection (MIP) of a volumetric image of the mouse reconstructed with the selected-frames (scale bar-1 cm).(B) Zoom-in of three regions marked in red, green, blue, respectively, in (A).Each image is reconstructed with (left) all-frames, (center) the selected (static) frames (right) the rejected (motion) frames (scale bar-1 mm).Structural differences are marked (yellow and red arrows).(C) Amplitude profiles marked in b (yellow dashed lines) for images reconstructed from all the frames (dashed lines) versus selected frames (solid lines).

Figure 3 .
Figure 3. Motion rejection results for cross-sectional imaging with the ring array system.(a) Reconstructed transverse slices of a mouse for two different locations rendered by considering the selected (left) versus rejected (right) frames (scale bar -1 cm).Distorted structures are marked (red and yellow arrows).(b) Amplitude profiles (of yellow dashed lines in a) for the images rendered with all (dashed line) versus selected (solid line) frames.(c) Distortion-based QI of the difference between the selected-and all-frames images (1 = high similarity; −1 = low similarity).

Figure 3 .
Figure 3. Motion rejection results for cross-sectional imaging with the ring array system.(A) Reconstructed transverse slices of a mouse for two different locations rendered by considering the selected (left) versus rejected (right) frames (scale bar-1 cm).Distorted structures are marked (red and yellow arrows).(B) Amplitude profiles (of yellow dashed lines in (A)) for the images rendered with all (dashed line) versus selected (solid line) frames.(C) Distortion-based QI of the difference between the selected-and all-frames images (1 = high similarity; −1 = low similarity).