R&D Result Detail

Original Title

Reducing Memory Requirements of Convolutional Neural Networks for Inference at the Edge

Type

Paper in proceedings (conference paper)

Original Abstract

This paper uses post-training quantization to analyse how lower-precision data types affect the accuracy of neural networks, without retraining the networks in question. The aim is to make high-accuracy neural networks usable on devices other than high-performance servers or supercomputers, bringing the computation closer to the device that collects the data. Running neural networks on edge devices raises two main issues: limited memory and limited computational performance. Both could be eased if lower-precision data types do not considerably reduce the accuracy of the networks in question.

Keywords

deep learning; neural networks; computer vision; machine learning; parallel computing; inference optimization; inference at the edge; reduced precision computing

Authors

BRAVENEC, T.; FRÝZA, T.

RIV year

2022

Released

20.04.2021

Publisher

Vysoké učení technické v Brně

Location

Brno

ISBN

978-0-7381-4436-8

Book

International Conference Radioelektronika 2021

Pages from

1

Pages to

6

Pages count

6

URL

https://ieeexplore.ieee.org/document/9420214

BibTeX

@inproceedings{BUT171248,
  author="Tomáš {Bravenec} and Tomáš {Frýza}",
  title="Reducing Memory Requirements of Convolutional Neural Networks for Inference at the Edge",
  booktitle="International Conference Radioelektronika 2021",
  year="2021",
  pages="1--6",
  publisher="Vysoké učení technické v Brně",
  address="Brno",
  doi="10.1109/RADIOELEKTRONIKA52220.2021.9420214",
  isbn="978-0-7381-4436-8",
  url="https://ieeexplore.ieee.org/document/9420214"
}
