Article | REF: S7793 V1

Cooperation of multiple reinforcement learning algorithms

Authors: Benoît GIRARD, Mehdi KHAMASSI

Publication date: December 10, 2016


ABSTRACT

Initially developed in the field of artificial intelligence, reinforcement learning methods are an essential component of adaptive robotic control architectures. Two main classes of algorithms have been proposed: those with and those without an internal model of the world. The first class has heavy computational costs but adapts quickly to change, while the second is computationally cheap but slow to converge. Combining these algorithms within a single robotic architecture could therefore bring together the advantages of both. We present here these two families of algorithms, together with the combination methods that have been proposed and tested in the neuroscience and robotics fields.


AUTHORS

  • Benoît GIRARD: CNRS Research Director - Institute of Intelligent Systems and Robotics, ISIR (UMR7222, CNRS – UPMC)

  • Mehdi KHAMASSI: CNRS Research Associate - Institute of Intelligent Systems and Robotics, ISIR (UMR7222, CNRS – UPMC)

INTRODUCTION

Reinforcement learning methods are essential components in the development of autonomous robotic systems. They must enable these systems to learn, through trial and error and without further intervention by their designers, which actions to perform and which to avoid in order to accomplish their mission.

Two main classes of algorithms have historically been developed in the literature: those based on an internal model of the world, in particular of the transitions between states, and those without an internal model. The former consume substantial computational resources (the computations required to infer which action the internal model predicts will lead to the best consequences), but they allow the system to react to changes in the environment within a few trials, by reusing the knowledge about the structure of the environment previously stored in the internal model. The latter are extremely inexpensive (no model, hence no estimation of the consequences of actions), but at the cost of slow convergence of learning and very poor adaptability to change (e.g. hundreds of trials may be required to update the values associated with actions after a change in the environment). It therefore seems natural to seek to benefit from the complementarity of these two approaches by combining them. However, the cooperation of multiple reinforcement learning systems has so far been little explored in the machine learning literature.
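
To make this contrast concrete, the sketch below gives a minimal Python illustration of the two classes on a small discrete problem. It is our own illustrative assumption, not code from the article: the names (Q, T, R, ALPHA, GAMMA), the tabular representation and the value-iteration planner are all choices made for the example.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (illustrative)

# --- Model-free learner: tabular Q-learning. One cheap update per step,
# but a change in the environment must be re-experienced many times
# before the cached values catch up.
Q = defaultdict(float)

def q_learning_update(s, a, r, s_next, actions):
    """Temporal-difference update; no model of transitions is kept."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# --- Model-based learner: estimate transitions and rewards, then plan.
T = defaultdict(lambda: defaultdict(int))  # T[(s, a)][s_next] = visit count
R = defaultdict(float)                     # R[(s, a)] = running mean reward

def model_update(s, a, r, s_next):
    """Update the internal world model from one experienced transition."""
    T[(s, a)][s_next] += 1
    n = sum(T[(s, a)].values())
    R[(s, a)] += (r - R[(s, a)]) / n

def plan(states, actions, sweeps=50):
    """Value iteration over the learned model: computationally costly,
    but a local change propagates to all state values in a few sweeps."""
    V = defaultdict(float)
    for _ in range(sweeps):
        for s in states:
            candidates = []
            for a in actions:
                n = sum(T[(s, a)].values())
                if n == 0:
                    continue  # never tried: the model says nothing yet
                expected_next = sum(c / n * V[s2]
                                    for s2, c in T[(s, a)].items())
                candidates.append(R[(s, a)] + GAMMA * expected_next)
            if candidates:
                V[s] = max(candidates)
    return V
```

In practice both learners would receive the same (s, a, r, s_next) transitions; the point of the sketch is only where the computational cost falls: at decision time for the planner, and at convergence time for the model-free learner.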

The good properties of such an approach were first highlighted in the study of animal behavior. Indeed, the coexistence of multiple learning systems, relying on distinct neural substrates, has been clearly demonstrated in neuroscience. Several computational models have been proposed to account for the way animals coordinate these multiple learning systems, and they are a source of inspiration for the design of robotic systems; a sketch of such a coordination mechanism is given below. This transfer has so far mainly concerned navigation, but need not be limited to it. Finally, the limits of these methods, whose scientific objective is the simulation of animal behavior rather than operational efficiency, can readily be overcome in an engineering context by relaxing the biological constraints.
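
As a hedged illustration of what such a coordination mechanism can look like, the sketch below is loosely inspired by uncertainty-based arbitration models from the neuroscience literature. The function, its uncertainty inputs and its decision rule are hypothetical choices for the example, not the method described in the article.

```python
def arbitrate(mb_value, mb_uncertainty, mf_value, mf_uncertainty):
    """Pick, for one state, the action value of the system that currently
    reports the lower uncertainty about its own estimate.

    mb_* : estimate and uncertainty from the model-based planner
    mf_* : estimate and uncertainty from the model-free learner
    """
    if mb_uncertainty < mf_uncertainty:
        return mb_value  # trust the planner early on and just after a change
    return mf_value      # trust the cheap cached values once they converge

# Illustrative call: just after a change in the environment, an updated
# world model typically reports the lower uncertainty, so the planner wins.
chosen = arbitrate(mb_value=1.2, mb_uncertainty=0.1,
                   mf_value=0.4, mf_uncertainty=0.8)
```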


KEYWORDS

Reinforcement learning   |   ensemble methods   |   neuro-inspiration   |   neuro-robotics


This article is included in

Robotics

