Article | REF: AF620 V1

Data analysis or multidimensional exploratory statistics

Authors: Philippe BESSE, Alain BACCINI

Publication date: April 10, 2011

Overview

AUTHORS

Philippe BESSE: Professor at INSA Toulouse - Toulouse Institute of Mathematics
Alain BACCINI: Former professor at Paul Sabatier University (Toulouse 3) - Institut de Mathématiques de Toulouse

INTRODUCTION

Data analysis techniques, or more precisely multidimensional exploratory statistics, are aimed at the descriptive study of large tables with n rows (individuals, or statistical units), where n ranges from a few tens to a few thousand or even millions, and p columns (statistical variables), where p ranges from a few tens to a few thousand. This objective is achieved by producing summary graphs and indicators that capture the structure and main characteristics of these large tables. The methods proposed are therefore descriptive techniques for studying a large number of variables and individuals; they complement elementary one- or two-dimensional statistical tools and are often a prerequisite for modeling or for an inferential, decision-oriented or predictive approach to the data studied.
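As an illustration of summarizing an n x p table with a few indicators, the sketch below uses principal component analysis, one common technique of this family. It is a minimal example under stated assumptions, not the authors' own procedure: the function name `pca_summary`, the simulated table and its dimensions are all hypothetical, and only NumPy is used.

```python
import numpy as np


def pca_summary(X, k=2):
    """Project the n individuals (rows) onto the first k principal axes.

    Returns the coordinates of the individuals on those axes and the
    share of total variance carried by each of the k axes.
    """
    X = np.asarray(X, dtype=float)
    # Center each variable (column) so the axes describe variability
    # around the mean individual.
    Xc = X - X.mean(axis=0)
    # Singular value decomposition of the centered table.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]              # coordinates of the individuals
    explained = s**2 / np.sum(s**2)        # variance share per axis
    return scores, explained[:k]


# Hypothetical table: n = 200 individuals, p = 10 variables driven by
# two hidden factors plus a small amount of noise.
rng = np.random.default_rng(0)
n, p = 200, 10
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, p))
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))

scores, explained = pca_summary(X, k=2)
print(scores.shape)       # one row of 2 coordinates per individual
print(explained.sum())    # share of variance kept by the 2-axis summary
```

Plotting the two columns of `scores` against each other gives the kind of summary graph described above, and `explained` is a simple indicator of how much of the table's structure that graph retains.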

Advances in measurement technology are producing ever-growing data flows, whose storage and analysis are made possible by the parallel development of computing resources. The objectives and fields of application of statistical data mining are many and varied. A few examples show how this exploration is of interest in different sectors:

in the industrial sector (agri-food, microelectronics, mechanical engineering, etc.), where process monitoring and product traceability automatically generate considerable data flows. Statistical exploration is a prerequisite for any modeling work, for example for implementing statistical process control (SPC) or failure detection;
upstream, in research and development, where needs are just as great: virtual screening of molecules in the pharmaceutical industry, sensometry in the agri-food industry, not to mention the considerable boom in post-genomic biotechnologies with transcriptomic and proteomic data;
in the tertiary sector (banking, insurance, mail order, telephone operators, etc.) and services, where huge customer files are mined (data mining) for marketing purposes, with the aim of personalizing customer relationship management.

This article is included in

Software technologies and System architectures

Bibliography

(1) - BENZECRI (J.P.) - L'analyse des données. L'analyse des correspondances - Dunod, Paris (1973).
(2) - BESSE (P.C.), CAUSSINUS (H.), FERRE (L.), FINE (J.) - Principal component analysis and optimization of graphical displays - Statistics,...

Websites

Other resources (handouts, practical exercises, functions written in R) are available on the website:

https://www.math.univ-toulouse.fr/

R Development Core Team - R: A Language and Environment for Statistical Computing - R Foundation for Statistical Computing

Find out more

The most useful general and introductory references for this theme are: Bouroche & Saporta (1980), Jobson (1992), Lebart, Morineau & Piron (2006), Mardia, Kent & Bibby (1979), Saporta (2006). More recent additions and developments can be found in: Droesbeke, Fichet & Tassi (1992), Govaert (2003).
