Search Results
- Stitching Weight-Shared Deep Neural Networks for Efficient Multitask Inference on GPU
  (2022) 2022 19th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON). Conference Paper.
  Intelligent personal and home applications demand multiple deep neural networks (DNNs) running on resource-constrained platforms for compound inference tasks, known as multitask inference. To fit multiple DNNs into low-resource devices, emerging techniques resort to weight sharing among DNNs to reduce their storage. However, such reduction in storage fails to translate into efficient execution on common accelerators such as GPUs. Most DNN ...
- Adaptive Loss-Aware Quantization for Multi-Bit Networks
  (2020) 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Conference Paper.
  We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerate the inference and reduce the storage for the deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that is able to achieve an average bitwidth below one bit without notable ...
- Pruning Meta-Trained Networks for On-Device Adaptation
  (2021) Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Conference Paper.
  Adapting neural networks to unseen tasks with few training samples on resource-constrained devices benefits various Internet-of-Things applications. Such neural networks should learn the new tasks in few shots and be compact in size. Meta-learning enables few-shot learning, yet the meta-trained networks can be over-parameterised. However, naive combination of standard compression techniques like network pruning with meta-learning jeopardises ...
- p-Meta: Towards On-device Deep Model Adaptation
  (2022) KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Conference Paper.
- Localised Adaptive Spatial-Temporal Graph Neural Network
  (2023) KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Conference Paper.
  Spatial-temporal graph models are prevailing for abstracting and modelling spatial and temporal dependencies. In this work, we ask the following question: whether and to what extent can we localise spatial-temporal graph models? We limit our scope to adaptive spatial-temporal graph neural networks (ASTGNNs), the state-of-the-art model architecture. Our approach to localisation involves sparsifying the spatial graph adjacency matrices. To ...