Search Results
-
VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
(2023) 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
The success of the Neural Radiance Fields (NeRF) in novel view synthesis has inspired researchers to propose neural implicit scene reconstruction. However, most existing neural implicit reconstruction methods optimize per-scene parameters and therefore lack generalizability to new scenes. We introduce VolRecon, a novel generalizable implicit reconstruction method with Signed Ray Distance Function (SRDF). To reconstruct the scene with fine ...
Conference Paper
-
ARrow: A Real-Time AR Rowing Coach
(2023) EuroVis 2023 - Short Papers
Rowing requires physical strength and endurance in athletes as well as a precise rowing technique. The ideal rowing stroke is based on biomechanical principles and typically takes years to master. Except for time-consuming video analysis after practice, coaches currently have no means to quantitatively analyze a rower's stroke sequence and body movement. We propose ARrow, an AR application for coaches and athletes that provides real-time ...
Conference Paper
-
Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors
(2023) 2023 IEEE International Conference on Robotics and Automation (ICRA)
A distinctive representation of image patches in the form of features is a key component of many computer vision and robotics tasks, such as image matching, image retrieval, and visual localization. State-of-the-art descriptors, from hand-crafted descriptors such as SIFT to learned ones such as HardNet, are usually high-dimensional: 128 dimensions or even more. The higher the dimensionality, the larger the memory consumption and computational ...
Conference Paper
-
Learning-based Relational Object Matching Across Views
(2023) 2023 IEEE International Conference on Robotics and Automation (ICRA)
Intelligent robots require object-level scene understanding to reason about possible tasks and interactions with the environment. Moreover, many perception tasks such as scene reconstruction, image retrieval, or place recognition can benefit from reasoning on the level of objects. While keypoint-based matching can yield strong results for finding correspondences for images with small to medium view point changes, for large view point ...
Conference Paper
-
Capturing and Animation of Body and Clothing from Monocular Video
(2022) SA ’22: SIGGRAPH Asia 2022 Conference Papers
While recent work has shown progress on extracting clothed 3D human avatars from a single image, video, or a set of 3D scans, several limitations remain. Most methods use a holistic representation to jointly model the body and clothing, which means that the clothing and body cannot be separated for applications like virtual try-on. Other methods separately model the body and clothing, but they require training from a large set of 3D clothed ...
Conference Paper
-
NeuralMeshing: Differentiable Meshing of Implicit Neural Representations
(2022) Lecture Notes in Computer Science ~ Pattern Recognition
The generation of triangle meshes from point clouds, i.e. meshing, is a core task in computer graphics and computer vision. Traditional techniques directly construct a surface mesh using local decision heuristics, while some recent methods based on neural implicit representations try to leverage data-driven approaches for this meshing process. However, it is challenging to define a learnable representation for triangle meshes of unknown ...
Conference Paper
-
Camera Pose Estimation using Implicit Distortion Models
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Low-dimensional parametric models are the de-facto standard in computer vision for intrinsic camera calibration. These models explicitly describe the mapping between incoming viewing rays and image pixels. In this paper, we explore an alternative approach which implicitly models the lens distortion. The main idea is to replace the parametric model with a regularization term that ensures the latent distortion map varies smoothly throughout ...
Conference Paper
-
Context-Aware Sequence Alignment using 4D Skeletal Augmentation
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Temporal alignment of fine-grained human actions in videos is important for numerous applications in computer vision, robotics, and mixed reality. State-of-the-art methods directly learn an image-based embedding space by leveraging powerful deep convolutional neural networks. While straightforward, their results are far from satisfactory: the aligned videos exhibit severe temporal discontinuity without additional post-processing steps. ...
Conference Paper
-
NICE-SLAM: Neural Implicit Scalable Encoding for SLAM
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Neural implicit representations have recently shown encouraging results in various domains, including promising progress in simultaneous localization and mapping (SLAM). Nevertheless, existing methods produce over-smoothed scene reconstructions and have difficulty scaling up to large scenes. These limitations are mainly due to their simple fully-connected network architecture that does not incorporate local information in the observations. ...
Conference Paper
-
Learning to Align Sequential Actions in the Wild
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
State-of-the-art methods for self-supervised sequential action alignment rely on deep networks that find correspondences across videos in time. They either learn frame-to-frame mapping across sequences, which does not leverage temporal information, or assume monotonic alignment between each video pair, which ignores variations in the order of actions. As such, these methods are not able to deal with common real-world scenarios that involve ...
Conference Paper