Article | REF: H1010 V1

Multi-pipeline superscalar processors

Authors: Daniel ETIEMBLE, François ANCEAU

Publication date: August 10, 2015, Review date: August 3, 2022

ABSTRACT

This article describes the main features of “multi-pipeline” superscalar processors, also called “in-order” superscalar processors. A superscalar processor exploits the instruction-level parallelism present in sequential code to start the execution of several independent instructions in each clock cycle. The additional problems raised by superscalar execution are detailed: register files, cache accesses, branch prediction and instruction fetch. Examples of in-order superscalar processors are presented, from the Intel Pentium to some IBM Power cores. In-order and out-of-order superscalar MIPS, IBM and ARM processors are then compared in terms of speed, power dissipation and chip area.

AUTHORS

Daniel ETIEMBLE: Engineer INSA Lyon - Professor Emeritus, Université Paris Sud
François ANCEAU: Engineer INPG Grenoble - Retired professor, researcher at LIP6 (Pierre-et-Marie-Curie University)

INTRODUCTION

This article examines the main features of multi-pipeline superscalar processors, often referred to as in-order superscalars. A superscalar processor exploits the instruction-level parallelism present in a sequential program to start the execution of several instructions in each clock cycle. The hardware determines which independent instructions can be started simultaneously in different pipelines, that is, those for which the necessary functional units are available and the operands are ready. Multiple execution pipelines already exist in scalar processors, even though these start only one instruction per cycle, because execution times differ between most integer instructions and floating-point instructions. Data dependency control is therefore already handled in scalar processors, and that handling is reviewed in the article. With superscalar processors, the hardware problems become more acute in several areas: register file, cache access, branch prediction and instruction fetch. In the multi-pipeline model, the hardware gathers instructions in groups of 2 or 4, and all the instructions in one group must be issued before any instruction of the next group is issued. Examples of the techniques used are given with Intel's Pentium and Atom, Digital's Alpha 21064 and 21164, ARM's Cortex-A8 and IBM's POWER6 core. The techniques used to overcome the limitations of strict group-by-group issue are detailed.
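
The group-by-group issue rule described above can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions, not the mechanism of any of the processors named: every instruction is assumed to have a one-cycle latency, functional units are assumed to be always available, and the names `Instr` and `issue_cycles` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction: one destination register, some source registers."""
    dest: str
    srcs: Tuple[str, ...]


def issue_cycles(program: List[Instr], group_size: int = 2) -> List[int]:
    """Return the issue cycle of each instruction under strict in-order,
    group-by-group issue.

    Assumptions (not from the article): one-cycle latency for every
    instruction, so a result is usable in the cycle after issue, and no
    structural hazards.
    """
    cycles: List[int] = []
    cycle = 0
    for start in range(0, len(program), group_size):
        group = program[start:start + group_size]
        written = set()  # registers written by instructions issued in this cycle
        for ins in group:
            # A RAW or WAW dependency on an instruction issued in the same
            # cycle forces this instruction to wait for the next cycle.
            if written & set(ins.srcs) or ins.dest in written:
                cycle += 1
                written = set()
            cycles.append(cycle)
            written.add(ins.dest)
        # The next group cannot start until this group has fully issued.
        cycle += 1
    return cycles


program = [
    Instr("r1", ("r2", "r3")),   # group 0
    Instr("r4", ("r5", "r6")),   # independent: issues with the previous one
    Instr("r7", ("r1", "r4")),   # group 1
    Instr("r8", ("r7", "r2")),   # needs r7: waits one cycle
    Instr("r9", ("r2", "r3")),   # group 2: independent, but must wait for group 1
    Instr("r10", ("r5", "r6")),
]
print(issue_cycles(program))  # [0, 0, 1, 2, 3, 3]
```

In this toy run, the fifth instruction depends on nothing in the second group, yet it cannot issue before that group has finished issuing. This is the limitation of strict group-by-group issue mentioned above.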

From a raw performance point of view, "in-order" superscalars are less efficient than "restricted data flow" superscalars, usually called "out-of-order" superscalars, which search for executable instructions in a much larger window than a group of 2 or 4 instructions. For the same manufacturer and the same CMOS technology, the two approaches can be compared in terms of execution time, chip area and power dissipation. The comparison is presented for two MIPS processors, two IBM cores and several ARM cores. At equivalent clock frequencies, the "out-of-order" version always performs better, but the "in-order" version consumes less power, uses less chip area and generally has the best performance per watt or per GHz. "In-order" superscalars are therefore a good solution for embedded applications that need more performance than scalar processors provide, but with less chip area and lower power consumption than the highest-performance solution.

KEYWORDS

superscalar | multi-pipeline | instruction launching | instruction level parallelism

This article is included in

Software technologies and System architectures
