Luc Steels’ Mars Explorer

A robot control system specification proposed by Luc Steels in [1].

The Specification includes:

1. the specification

1.1. Random Behaviour

1.1.1. movement behaviour

1.1.1.1. choose randomly a direction to move
1.1.1.2. move in that direction

1.1.2. handling behaviour

1.1.2.1. if I sense a sample and am not carrying one, I pick it up
1.1.2.2. if I sense the vehicle-platform and am carrying a sample, I drop it.

1.1.3. Obstacle avoidance

1.1.3.1. if I sense an obstacle in front, I make a random turn
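
Read as condition-action rules, the random behaviour in 1.1 can be sketched in Java roughly as below. This is only an illustration: the rule ordering and the sensor/actuator names (senseObstacle, pickUpSample, and so on) are my assumptions, not part of Steels’ specification.

  import java.util.Random;

  /** Minimal sketch of the random-behaviour rules in 1.1 (hypothetical robot interface). */
  public class RandomBehaviour {
      private final Random rng = new Random();
      private boolean carrying = false;

      /** One control step; earlier rules take precedence over the random walk. */
      public void step(Robot robot) {
          if (robot.senseObstacle()) {                     // 1.1.3.1: obstacle ahead -> random turn
              robot.turn(rng.nextDouble() * 360.0);
              return;
          }
          if (robot.senseSample() && !carrying) {          // 1.1.2.1: sample sensed, not carrying -> pick up
              robot.pickUpSample();
              carrying = true;
              return;
          }
          if (robot.senseVehiclePlatform() && carrying) {  // 1.1.2.2: at platform, carrying -> drop
              robot.dropSample();
              carrying = false;
              return;
          }
          robot.turn(rng.nextDouble() * 360.0);            // 1.1.1: otherwise move in a random direction
          robot.moveForward();
      }

      /** Hypothetical vehicle interface; a real controller would wrap a simulator API. */
      public interface Robot {
          boolean senseObstacle();
          boolean senseSample();
          boolean senseVehiclePlatform();
          void turn(double degrees);
          void moveForward();
          void pickUpSample();
          void dropSample();
      }
  }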

1.2. Gradient field behaviour

1.2.1. definition of gradient field

Steels defined the gradient field through a diffusion process governed by a difference equation; I could not find the exact model. The idea is nonetheless clear: the field encodes how far the robot is from the vehicle.
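
Since the exact equation is unknown to me, the following is an assumption only: a standard discrete diffusion update with the vehicle cell held at a fixed source value and a decay factor would produce a field of this kind.

  % g_t(c): field value in cell c at step t; N(c): the grid neighbours of c.
  g_{t+1}(c) =
    \begin{cases}
      G_{\max} & \text{if } c \text{ is the vehicle cell (fixed source)} \\
      \frac{\delta}{|N(c)|} \sum_{c' \in N(c)} g_t(c') & \text{otherwise, with } 0 < \delta < 1
    \end{cases}

At the fixed point the values decay with grid distance from the vehicle, so a higher field value means being closer to the vehicle, which is the property the behaviours in 1.2.2–1.2.4 rely on.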

1.2.2. Mode Determination

1.2.2.1. if I am in exploration mode and I sense no gradient lower than the concentration in the cell I am located on, change to return mode
1.2.2.2. if I am in return mode and I am at the vehicle-platform, change to exploration mode
1.2.2.3. if I am holding a sample, change to return mode

1.2.3. Return movement

1.2.3.1. if in return mode, choose the direction of highest gradient

1.2.4. Explore movement

1.2.4.1. if in explore mode, choose the direction of lowest gradient
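
Read together, 1.2.2–1.2.4 can be sketched in Java as below; the order in which the mode rules are checked and the FieldView interface for reading the gradient field in the current and neighbouring cells are my assumptions.

  /** Minimal sketch of the mode and gradient-following rules in 1.2.2-1.2.4. */
  public class GradientBehaviour {
      enum Mode { EXPLORE, RETURN }
      private Mode mode = Mode.EXPLORE;

      public void step(FieldView field, boolean carryingSample, boolean atVehiclePlatform) {
          if (carryingSample) {                           // 1.2.2.3: holding a sample -> return mode
              mode = Mode.RETURN;
          }
          if (mode == Mode.EXPLORE                        // 1.2.2.1: no lower neighbouring cell -> return mode
                  && field.lowestNeighbourValue() >= field.valueHere()) {
              mode = Mode.RETURN;
          }
          if (mode == Mode.RETURN && atVehiclePlatform) { // 1.2.2.2: back at the platform -> explore again
              mode = Mode.EXPLORE;
          }
          // 1.2.3.1 / 1.2.4.1: move up the gradient when returning, down it when exploring
          double heading = (mode == Mode.RETURN)
                  ? field.directionOfHighestNeighbour()
                  : field.directionOfLowestNeighbour();
          field.moveTowards(heading);
      }

      /** Hypothetical view of the gradient field around the robot, plus a movement command. */
      public interface FieldView {
          double valueHere();
          double lowestNeighbourValue();
          double directionOfHighestNeighbour();
          double directionOfLowestNeighbour();
          void moveTowards(double heading);
      }
  }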

1.3. self-organization through path-attraction

1.3.1. if I carry a sample, I drop 2 crumbs

1.3.2. if I carry no sample and crumbs are detected, I pick up one crumb

1.3.3. if I carry no sample and crumbs are detected, move towards the highest concentration of crumbs.
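
A minimal Java sketch of the three crumb rules; the crumb sensing and dropping interface is a hypothetical placeholder.

  /** Minimal sketch of the path-attraction (crumb) rules in 1.3. */
  public class CrumbBehaviour {
      public void step(CrumbWorld world, boolean carryingSample) {
          if (carryingSample) {
              world.dropCrumbs(2);                              // 1.3.1: lay a trail while carrying
          } else if (world.crumbsDetected()) {
              world.pickUpCrumb();                              // 1.3.2: weaken the trail by one crumb
              world.moveTowards(world.directionOfMostCrumbs()); // 1.3.3: follow the densest crumbs
          }
      }

      /** Hypothetical interface for sensing and manipulating crumbs. */
      public interface CrumbWorld {
          boolean crumbsDetected();
          void dropCrumbs(int n);
          void pickUpCrumb();
          double directionOfMostCrumbs();
          void moveTowards(double heading);
      }
  }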

2. justifications

2.1. random walk theorem

Starting from any point, a random walk restricted to a finite space will reach any other point any number of times (Chung, 1974).
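
One standard way to formalize this (my reading of the intended statement, not Chung’s exact wording): for an irreducible random walk (X_n) on a finite state space S, i.e. one in which every cell can be reached from every other, every state is visited infinitely often with probability one:

  \Pr\bigl( X_n = j \ \text{for infinitely many } n \;\big|\; X_0 = i \bigr) = 1
  \qquad \text{for all } i, j \in S.

This is what justifies replacing explicit exploration planning with a random walk: given enough time, the walk covers the whole (finite) terrain.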

Bibliography

[1]
L. Steels, “Cooperation between distributed agents through self-organization,” Decentralized AI, pp. 175–196, 1990. Accessed: Oct. 31, 2023. [Online]. Available: https://cir.nii.ac.jp/crid/1570009749529465472

Backlinks

Luc Steels’ Mars Explorer has very low cognitive load because it replaces complex exploration planning with random exploration

developmental luc steels’ mars explorer

My BSc Computer Science Final Year Project, supervised by Terry Payne

An implementation of Luc Steels’ Mars Explorer in a robot simulator, with extensions for developmental learning support.

developmental luc steels’ mars explorer

(Design & Plan)

First, I’ll implement Luc Steels’ Mars Explorer as-is in Webots, which would involve the following (a rough sketch of the behaviour interface appears after the list):

  1. implementing the subsumption architecture in Java
  2. implementing a vehicle controller on top of the Webots API that exposes an abstract API for Luc Steels’ Mars Explorer
  3. implementing the individual behaviours of Luc Steels’ Mars Explorer using the abstract API from step 2 and the subsumption behaviour interface from step 1
  4. integrating the individual behaviours from step 3 into the subsumption architecture from step 1
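
As a rough sketch of steps 1 and 4 under these assumptions, the behaviour interface and its arbitration might look as follows in Java; the names (Behaviour, VehicleApi, SubsumptionController) are placeholders of my own, not part of the Webots API.

  import java.util.List;

  /** Rough sketch of a subsumption-style controller (step 1) and layer integration (step 4). */
  public class SubsumptionController {

      /** One behaviour layer of the subsumption architecture. */
      public interface Behaviour {
          boolean active(VehicleApi vehicle); // does this layer want control right now?
          void act(VehicleApi vehicle);       // issue actuator commands for one step
      }

      /** Abstract vehicle API, to be backed by the Webots controller from step 2. */
      public interface VehicleApi { /* sensor and actuator accessors go here */ }

      private final List<Behaviour> layers; // ordered from highest to lowest priority

      public SubsumptionController(List<Behaviour> layers) {
          this.layers = layers;
      }

      /** Each step, the highest-priority active layer suppresses the layers below it. */
      public void step(VehicleApi vehicle) {
          for (Behaviour layer : layers) {
              if (layer.active(vehicle)) {
                  layer.act(vehicle);
                  return;
              }
          }
      }
  }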

Then, I’ll implement a naive developmental extension within the existing setting (map, sensors, and motors), namely the constructivist knowledge base, the valence function/system, and the interaction observation system. After that comes a new “developmental explore” layer to replace the default “random walk” layer of Luc Steels’ Mars Explorer; it would query the constructivistically learnt knowledge base for the next action rather than picking one purely at random. Following that would be experimental metrics and trace analysis to see whether any interesting behaviour emerges.
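
As a very rough sketch of this plan (nothing here exists yet, and every name is a placeholder of my own), the “developmental explore” layer might query the knowledge base like this:

  /** Sketch of the planned "developmental explore" layer: it asks a constructivist
   *  knowledge base for the next action instead of acting purely at random, then feeds
   *  the observed interaction and its valence back into the knowledge base. */
  public class DevelopmentalExploreLayer {

      /** A primitive or composite action the vehicle can enact. */
      public interface Action { void execute(); }

      /** Constructivist knowledge base: proposes actions, learns from recorded interactions. */
      public interface KnowledgeBase {
          Action proposeAction(Object situation);
          void recordInteraction(Object situation, Action action, double valence);
      }

      /** Valence function/system: scores how an enacted interaction went. */
      public interface ValenceSystem {
          double valence(Object situation, Action action);
      }

      private final KnowledgeBase kb;
      private final ValenceSystem valenceSystem;

      public DevelopmentalExploreLayer(KnowledgeBase kb, ValenceSystem valenceSystem) {
          this.kb = kb;
          this.valenceSystem = valenceSystem;
      }

      /** One step: enact the proposed action, then record the interaction and its valence. */
      public void step(Object situation) {
          Action action = kb.proposeAction(situation);
          action.execute();
          kb.recordInteraction(situation, action, valenceSystem.valence(situation, action));
      }
  }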

One interesting idea is whether a shortcut to a signal would change the behaviour. For example, the gradient-field signal enables the agent to derive a “go up gradient” action from gradient information: with only the gradient signal and a move action, it is possible to develop a composite interaction equivalent to “go up gradient”; however, “go up gradient” can also be given as a primitive action. What would the differences be between having it as a primitive or not?

Another interesting idea is to learn with multimodality, for example via freezing. Multimodality could perhaps be achieved by freezing a part of the knowledge base into a subsumption behaviour layer and inserting it right above the “developmental explore” layer as a “developed behaviour”. I still need to figure out the triggers (do we keep learning these layers continuously, or do they freeze, perhaps gradually?) and reinitialization (do we delete the frozen part of the knowledge base? do we start anew with an empty knowledge base?). This is like learning multiple strategies or skills: when learning baseball, once I have learnt how to hit the ball with a bat, I leave that skill there; my knowledge of it does not change much, nor does it need to, since I can do it well enough, and I focus on catching instead.

This project implements Luc Steels’ Mars Explorer in Webots.

This is a case study of applying the subsumption architecture plus randomness to an autonomous mobile robot system.

This paper is a compilation of the following ideas:

Author: Linfeng He

Created: 2024-04-03 Wed 19:36