11. Concluding remarks
Encoding of 8-, 16-, 32- and 64-bit integers and 32- and 64-bit floats has been implemented in all general-purpose processors for decades. The fixed-point format is used mainly in signal processors.
Over the past ten years or so, new formats have been introduced as the result of two considerations:
deep neural networks, particularly for inference, can benefit from reduced-precision formats, which shrink the silicon area and power dissipation of arithmetic operators without any significant loss of accuracy. Some of these formats, such as the 16-bit floats FP16 and BF16 (bfloat16), are implemented in the instruction sets of general-purpose processors, while the TF32 format is implemented in the tensor cores of recent Nvidia GPUs (the formats are compared in the sketch after this list);
many applications use specialized processors such as neural processors (Google TPU,...
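To make the precision/range trade-off of these reduced formats concrete, the following Python sketch (not part of the original article) compares their bit allocations. The field widths are the standard published ones; the helper function format_properties is illustrative only, and TF32 is listed by the 19 bits actually used inside the tensor cores, although it occupies a 32-bit word in memory.

# Sketch: compare reduced floating-point formats by bit allocation.
# Field widths (sign, exponent, fraction) are the standard published values.

FORMATS = {
    #  name   (sign, exponent bits, fraction bits)
    "FP32": (1, 8, 23),
    "FP16": (1, 5, 10),
    "BF16": (1, 8, 7),
    "TF32": (1, 8, 10),   # 19 significant bits used inside Nvidia tensor cores
}

def format_properties(exp_bits: int, frac_bits: int):
    """Return (machine epsilon, largest finite value) for an IEEE-like format."""
    bias = 2 ** (exp_bits - 1) - 1
    eps = 2.0 ** (-frac_bits)                          # spacing just above 1.0
    max_val = (2.0 - 2.0 ** (-frac_bits)) * 2.0 ** bias
    return eps, max_val

for name, (s, e, m) in FORMATS.items():
    eps, max_val = format_properties(e, m)
    print(f"{name}: {s + e + m:2d} significant bits, eps ~ {eps:.1e}, max ~ {max_val:.1e}")

Running the sketch shows the trade-off directly: FP16 keeps more fraction bits (finer precision) but a narrow exponent (maximum value 65504), whereas BF16 and TF32 keep the FP32 exponent range at the cost of a coarser fraction.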