Search Results
- Stitching Weight-Shared Deep Neural Networks for Efficient Multitask Inference on GPU
  (2022) 2022 19th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON). Conference Paper.
  Intelligent personal and home applications demand multiple deep neural networks (DNNs) running on resource-constrained platforms for compound inference tasks, known as multitask inference. To fit multiple DNNs into low-resource devices, emerging techniques resort to weight sharing among DNNs to reduce their storage. However, such reduction in storage fails to translate into efficient execution on common accelerators such as GPUs. Most DNN ...
- Adaptive Loss-Aware Quantization for Multi-Bit Networks
  (2020) 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Conference Paper.
  We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerate the inference and reduce the storage for the deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that is able to achieve an average bitwidth below one bit without notable ...
- Pruning Meta-Trained Networks for On-Device Adaptation
  (2021) Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Conference Paper.
  Adapting neural networks to unseen tasks with few training samples on resource-constrained devices benefits various Internet-of-Things applications. Such neural networks should learn the new tasks in few shots and be compact in size. Meta-learning enables few-shot learning, yet the meta-trained networks can be over-parameterised. However, naive combination of standard compression techniques like network pruning with meta-learning jeopardises ...
- p-Meta: Towards On-device Deep Model Adaptation
  (2022) KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Conference Paper.
- Localised Adaptive Spatial-Temporal Graph Neural Network
  (2023) KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Conference Paper.
  Spatial-temporal graph models are prevailing for abstracting and modelling spatial and temporal dependencies. In this work, we ask the following question: whether and to what extent can we localise spatial-temporal graph models? We limit our scope to adaptive spatial-temporal graph neural networks (ASTGNNs), the state-of-the-art model architecture. Our approach to localisation involves sparsifying the spatial graph adjacency matrices. To ...