5. SIMD/vector register size and memory system
We have seen that the essential difference between SIMD and vector extensions is the number of instructions (opcodes), and that vector extensions would benefit from larger vector registers. Three examples show that larger registers force us to reconsider the memory system.
5.1 AVX-512
The AVX-512 instructions VMOVDQA, VMOVAPS and VMOVAPD for aligned loads and stores, and their counterparts for unaligned accesses, transfer 512 bits between a SIMD register and the L1 data cache. Intel CPUs with the AVX-512 extension have 64-byte (512-bit) cache lines, so a single aligned transfer between a register and the L1 cache moves an entire line. In a loop that streams through memory, each access therefore touches a new line, and prefetching (hardware, or hardware plus software) is essential to avoid a cache miss on each access. The L1D...
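The relation between register width and cache line size can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original article: the helper names `lines_touched` and `accesses_per_line` are hypothetical, and the only figure taken from the text is the 64-byte line size. It counts how many cache lines one access covers and how many full-register accesses fit in one line.

```python
# Illustrative sketch only: the function names are assumptions made for this
# example. The 64-byte line size is the value given in the text.
CACHE_LINE_BYTES = 64


def lines_touched(address: int, access_bytes: int,
                  line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return the number of distinct cache lines covered by one access."""
    first_line = address // line_bytes
    last_line = (address + access_bytes - 1) // line_bytes
    return last_line - first_line + 1


def accesses_per_line(register_bits: int,
                      line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return how many consecutive full-register accesses fit in one line."""
    register_bytes = register_bits // 8
    return line_bytes // register_bytes


if __name__ == "__main__":
    # Narrower registers reuse a line several times; a 512-bit register
    # consumes a whole line with every aligned access.
    for bits in (128, 256, 512):
        print(bits, "bits:", accesses_per_line(bits), "access(es) per line")

    # An aligned 64-byte access stays within one line; an access that starts
    # in the middle of a line spills into the next one.
    print("aligned access at 0:", lines_touched(0, 64), "line(s)")
    print("unaligned access at 32:", lines_touched(32, 64), "line(s)")
```

With 128-bit or 256-bit registers, a streaming loop issues several accesses to the same line before moving on, so a single miss is amortized over them. With 512-bit registers every aligned access lands on a different line, which is why the text stresses prefetching.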
This article is included in: Software technologies and System architectures
Bibliography
- (1) - PATTERSON (D.), WATERMAN (A.) - SIMD instructions considered harmful. - ACM SIGARCH, Computer Architecture Today, Sep 18 (2017). https://www.sigarch.org/csimd-instructions-considered-harmful/
- ...