Overview
ABSTRACT
This article describes the main characteristics of the SIMD/multimedia extensions found in the instruction sets of modern microprocessors. The SSE to SSE4, AVX and AVX2 extensions to the IA-32/Intel 64 ISAs, the Neon extensions to the ARM ISA and various IBM extensions (Altivec) serve as examples. The article covers the specifics of integer arithmetic, if-then-else implementations and memory accesses. It shows how these extensions combine natural extensions of scalar instructions with ad hoc instructions for particular applications. It describes how to use these instructions, either by helping the compiler to "vectorize" or by using intrinsics, which are function calls corresponding to SIMD instructions that are inserted into a C or C++ program.
AUTHORS
- Daniel ETIEMBLE: Engineer INSA Lyon - Professor Emeritus, Université Paris Sud
- Lionel LACASSAGNE: EPITA engineer - Senior lecturer at Université Paris Sud
INTRODUCTION
This article describes the main features of the SIMD extensions that have been added to modern microprocessor instruction sets since the 1990s. Scalar arithmetic and logic instructions operate on the full width of the processor registers, 32 or 64 bits. However, many programs work on smaller data, such as bytes (8 bits) or 16-bit words, signed or unsigned. This is the case in image processing, signal processing and many other applications. The principle of SIMD instructions is therefore to use wider registers (128, 256 or 512 bits) and to perform the same operation on every element of a vector: 8-, 16- or 32-bit elements for integers, 32- or 64-bit elements for floating-point numbers.
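As a minimal sketch of this principle, the function below adds sixteen 8-bit elements with a single vector operation. It assumes a GCC- or Clang-compatible compiler, because it relies on the `vector_size` attribute, which is a compiler extension and not standard C. The names `v16u8` and `add16_u8` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang vector extension. One 128-bit vector = 16 unsigned bytes. */
typedef uint8_t v16u8 __attribute__((vector_size(16)));

/* Adds 16 bytes of a and b element by element with one vector addition. */
void add16_u8(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    v16u8 va, vb, vc;

    memcpy(&va, a, sizeof va);    /* load 16 consecutive bytes */
    memcpy(&vb, b, sizeof vb);
    vc = va + vb;                 /* same operation on 16 independent 8-bit lanes */
    memcpy(out, &vc, sizeof vc);  /* store 16 consecutive bytes */
}
```

Each lane is computed independently, so a sum that exceeds 255 wraps around inside its own lane and no carry propagates to the neighbouring element.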
The characteristics of these extensions are illustrated using the most widely used ones: SSE to SSE4.2, AVX and AVX2 for Intel's IA-32 and Intel 64 instruction sets, ARM's Neon and Neon2, and Altivec with its various IBM variants.
Floating-point arithmetic instructions pose no particular problems. The problem of carries in integer arithmetic instructions is discussed in detail. As SIMD instructions perform the same operation on all elements of a vector, if-then-else conditional structures require special handling. Memory instructions must access elements located at successive memory addresses, which implies special processing when this is not the case. The classic example is data stored in memory as an array of structures (AoS), which must be transformed into a structure of arrays (SoA) to enable SIMD calculations.
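Two of the points above can be sketched in a few lines of C. The first function shows one common way to express an if-then-else without a branch: both alternatives are computed for every lane and a comparison mask selects the result. The second shows an AoS to SoA conversion written as a plain scalar loop. This is an illustrative sketch and not the article's own code: it assumes a GCC- or Clang-compatible compiler for the `vector_size` extension, and all type and function names (`v4i32`, `absdiff4`, `point_aos`, `aos_to_soa`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang vector extension. One 128-bit vector = 4 signed 32-bit lanes. */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* r[i] = (a[i] > b[i]) ? a[i] - b[i] : b[i] - a[i], on 4 lanes, with no branch. */
void absdiff4(const int32_t *a, const int32_t *b, int32_t *r)
{
    v4i32 va, vb;

    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    v4i32 mask   = va > vb;   /* all bits set in lanes where the test holds, 0 elsewhere */
    v4i32 then_v = va - vb;   /* "then" value, computed for every lane */
    v4i32 else_v = vb - va;   /* "else" value, computed for every lane */
    v4i32 res    = (then_v & mask) | (else_v & ~mask);   /* per-lane selection */

    memcpy(r, &res, sizeof res);
}

/* Array of structures: x, y and z of one point are interleaved in memory. */
typedef struct { float x, y, z; } point_aos;

/* Copies the fields into three separate arrays (structure of arrays), so that
   all x values, all y values and all z values sit at consecutive addresses. */
void aos_to_soa(const point_aos *in, size_t n, float *x, float *y, float *z)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = in[i].x;
        y[i] = in[i].y;
        z[i] = in[i].z;
    }
}
```

After the conversion, a vector load from `x`, `y` or `z` fetches the same field of several consecutive points, which matches the consecutive-address requirement of SIMD memory instructions.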
Most SIMD instructions are natural extensions of scalar instructions, complemented by data manipulation instructions that facilitate SIMD processing. Alongside them come ad hoc instructions for specific applications. SIMD extensions have also been called multimedia extensions, as they were originally intended to make general-purpose processors competitive in multimedia, signal processing and security applications. Typical examples of ad hoc instructions include motion detection, complex number calculations, cryptography, etc.
This article also explains how to use these instructions. One possibility is to help the compiler to "vectorize", i.e. to generate SIMD instructions itself. The other approach is to insert intrinsics into C or C++ code: these are function calls that correspond to SIMD instructions, to be used in particular for integer arithmetic or when high-level transformations are required that are beyond the compiler's reach.
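The two approaches can be compared on a saturating addition of unsigned bytes, an integer operation of the kind mentioned above. The sketch below is illustrative only. The first function is a plain loop written so that a compiler may vectorize it (`restrict` promises that the arrays do not overlap), although whether it actually does depends on the compiler and its options. The second uses the SSE2 intrinsics `_mm_loadu_si128`, `_mm_adds_epu8` and `_mm_storeu_si128` from `<emmintrin.h>`. The function names are hypothetical, and the `__SSE2__` guard is an assumption about GCC/Clang-style predefined macros: where it is not defined, the second function falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Approach 1: a simple counted loop that the compiler is free to vectorize. */
void add_sat_auto(uint8_t *restrict dst, const uint8_t *restrict a,
                  const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned s = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(s > 255 ? 255 : s);   /* clamp to 255 instead of wrapping */
    }
}

/* Approach 2: explicit intrinsics, 16 bytes per iteration, scalar loop for the tail. */
void add_sat_intrin(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;

#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));   /* unaligned load */
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i vs = _mm_adds_epu8(va, vb);                       /* saturating add, 16 lanes */
        _mm_storeu_si128((__m128i *)(dst + i), vs);               /* unaligned store */
    }
#endif
    for (; i < n; i++) {   /* remaining elements, or everything when SSE2 is unavailable */
        unsigned s = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(s > 255 ? 255 : s);
    }
}
```

Both functions produce the same results. The intrinsic version states the saturating operation directly, whereas the first version relies on the compiler recognizing the clamping pattern.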
Recent 512-bit extensions remove some of the limitations of SIMD extensions by allowing partial processing of vector elements according to a mask, and by enabling memory accesses at non-consecutive addresses. These developments bring...
KEYWORDS
SIMD instructions | SSE | AVX | Neon | Altivec | vectorization | intrinsics
This article is included in
Software technologies and System architectures
SIMD instruction set extensions