Overview
ABSTRACT
Deep neural networks, widely used in artificial intelligence applications, have driven the development of hardware support to accelerate their execution. After a brief review of the fundamentals of these networks, and in particular of convolutional neural networks, the operators that require acceleration are presented. The properties that allow the use of reduced numerical precision are discussed, along with the corresponding data formats. The main acceleration techniques are then described: added instructions and dedicated hardware components (specialized operators to be integrated into systems-on-chip, neural processors), with examples of circuits available from ARM, Intel, Google, NVidia and Xilinx.
AUTHOR
- Daniel ETIEMBLE: Professor Emeritus, LRI, Université Paris Saclay
INTRODUCTION
With the growing importance of artificial intelligence applications, deep neural networks are used more and more, and significant hardware and software support has been developed for them. Major operators (Google, Microsoft, etc.) and circuit suppliers (ARM, Intel, NVidia, Xilinx), as well as a host of smaller companies and startups, offer hardware solutions to accelerate the execution of applications based on deep neural networks. The aim of this article is to explain the characteristics of these hardware solutions in relation to the main features of neural networks.
Without claiming to be a theoretical or exhaustive presentation, the basic principles of neural networks (NN) are recalled: the structure of a NN, the structure of a neuron, the activation function, and the two phases of use of a NN (training and inference). Neural networks are used at several levels: data centers, servers at the edge of the network (edge devices), smartphones, and Internet of Things (IoT) components, with different performance and energy-consumption constraints that lead to different hardware support.
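The neuron structure recalled above (a weighted sum of inputs plus a bias, followed by an activation function) can be illustrated with a minimal Python sketch. The function names and the choice of ReLU as activation are assumptions for illustration, not taken from the article.

```python
def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs plus a bias,
    passed through a ReLU activation (hypothetical helper, for illustration)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, z)  # ReLU, a common activation in deep networks


def dense_layer(inputs, weight_rows, biases):
    """A fully connected (dense) layer is just many neurons
    applied to the same input vector."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

A layer output such as `dense_layer([1.0, 1.0], [[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0])` then yields one value per neuron, with negative weighted sums clipped to zero by the activation.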
While 32-bit floats are the basic numerical format for neural networks, performance and power constraints have led to the use of 8-bit and 16-bit integer formats and of reduced float formats (FP16, BF16, TF32), which are presented. The operators specific to convolutional neural networks are then described: convolution, pooling, and fully connected (dense) layers.
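Among the reduced formats mentioned, BF16 keeps the 8 exponent bits of a 32-bit float but only 7 mantissa bits, so a float32 value can be converted by keeping the top 16 bits of its encoding. The sketch below uses simple truncation (real hardware typically rounds) and hypothetical function names.

```python
import struct

def float32_to_bf16_bits(x):
    # Reinterpret the float32 as a 32-bit integer and keep the top 16 bits:
    # sign (1) + exponent (8) + upper 7 mantissa bits = BF16.
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    return bits >> 16

def bf16_bits_to_float32(b):
    # Re-expand BF16 to float32 by zero-filling the dropped mantissa bits.
    return struct.unpack('<f', struct.pack('<I', b << 16))[0]

def bf16_trunc(x):
    """Round-trip a float through BF16 (truncation, for illustration)."""
    return bf16_bits_to_float32(float32_to_bf16_bits(x))
```

Because BF16 keeps the full float32 exponent, the dynamic range is unchanged; only precision drops, with a relative error bounded by about 2^-7 under truncation.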
Examples of hardware support are given: the AI instructions added to the Intel instruction set for integer computing, NVidia tensor cores, neural processors from ARM (Ethos), Intel (Nervana NNP-T) and Google (TPU), and the Xilinx VC1902 system-on-chip.
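The integer AI instructions mentioned above essentially compute dot products of 8-bit integers accumulated in a 32-bit register, the core operation of convolution and dense layers after quantization. A plain-Python sketch of that pattern (hypothetical names, scalar rather than vectorized) might look like:

```python
def int8_dot(a, b):
    """Dot product of two int8 vectors accumulated in a wide integer,
    the basic operation that integer AI instructions accelerate."""
    assert len(a) == len(b)
    acc = 0  # plays the role of the int32 accumulator
    for x, y in zip(a, b):
        # enforce the int8 range of the quantized operands
        assert -128 <= x <= 127 and -128 <= y <= 127, "int8 range"
        acc += x * y
    return acc
```

A real instruction performs many such multiply-accumulates per cycle on packed 8-bit lanes; the wide accumulator is what prevents overflow when summing many small products.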
KEYWORDS
deep neural networks | hardware operators | numerical precision | neural processors | acceleration devices for systems-on-chip
This article is included in
Software technologies and System architectures
Hardware support for deep neural networks