Luc Steels’ Mars Explorer

A robot control system specification proposed by Luc Steels in [1].

The Specification includes:

1. the specification

1.1. Random Behaviour

1.1.1. movement behaviour

1.1.1.1. choose randomly a direction to move
1.1.1.2. move in that direction

1.1.2. handling behaviour

1.1.2.1. if I sense a sample and am not carrying one, I pick it up
1.1.2.2. if I sense the vehicle-platform and am carrying a sample, I drop it.

1.1.3. Obstacle avoidance

1.1.3.1. if I sense an obstacle in front, I make a random turn
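
Read as condition-action rules, the random behaviour in 1.1 can be sketched in Java roughly as below. This is only an illustration: the rule ordering and the sensor/actuator names (senseObstacle, pickUpSample, and so on) are my assumptions, not part of Steels’ specification.

  import java.util.Random;

  /** Minimal sketch of the random-behaviour rules in 1.1 (hypothetical robot interface). */
  public class RandomBehaviour {
      private final Random rng = new Random();
      private boolean carrying = false;

      /** One control step; earlier rules take precedence over the random walk. */
      public void step(Robot robot) {
          if (robot.senseObstacle()) {                     // 1.1.3.1: obstacle ahead -> random turn
              robot.turn(rng.nextDouble() * 360.0);
              return;
          }
          if (robot.senseSample() && !carrying) {          // 1.1.2.1: sample sensed, not carrying -> pick up
              robot.pickUpSample();
              carrying = true;
              return;
          }
          if (robot.senseVehiclePlatform() && carrying) {  // 1.1.2.2: at platform, carrying -> drop
              robot.dropSample();
              carrying = false;
              return;
          }
          robot.turn(rng.nextDouble() * 360.0);            // 1.1.1: otherwise move in a random direction
          robot.moveForward();
      }

      /** Hypothetical vehicle interface; a real controller would wrap a simulator API. */
      public interface Robot {
          boolean senseObstacle();
          boolean senseSample();
          boolean senseVehiclePlatform();
          void turn(double degrees);
          void moveForward();
          void pickUpSample();
          void dropSample();
      }
  }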

1.2. Gradient field behaviour

1.2.1. definition of gradient field

Steels defined the gradient field through a diffusion process governed by a difference equation; I could not find the exact model. The idea is nonetheless clear: the field encodes how far the robot is from the vehicle.
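
Since the exact equation is unknown to me, the following is an assumption only: a standard discrete diffusion update with the vehicle cell held at a fixed source value and a decay factor would produce a field of this kind.

  % g_t(c): field value in cell c at step t; N(c): the grid neighbours of c.
  g_{t+1}(c) =
    \begin{cases}
      G_{\max} & \text{if } c \text{ is the vehicle cell (fixed source)} \\
      \frac{\delta}{|N(c)|} \sum_{c' \in N(c)} g_t(c') & \text{otherwise, with } 0 < \delta < 1
    \end{cases}

At the fixed point the values decay with grid distance from the vehicle, so a higher field value means being closer to the vehicle, which is the property the behaviours in 1.2.2–1.2.4 rely on.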

1.2.2. Mode Determination

1.2.2.1. if I am in exploration mode and I sense no gradient lower than the concentration in the cell I am located on, change to return mode
1.2.2.2. if I am in return mode and I am at the vehicle-platform, change to exploration mode
1.2.2.3. if I am holding a sample, change to return mode

1.2.3. Return movement

1.2.3.1. if in return mode, choose the direction of highest gradient

1.2.4. Explore movement

1.2.4.1. if in explore mode, choose the direction of lowest gradient
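
Read together, 1.2.2–1.2.4 can be sketched in Java as below; the order in which the mode rules are checked and the FieldView interface for reading the gradient field in the current and neighbouring cells are my assumptions.

  /** Minimal sketch of the mode and gradient-following rules in 1.2.2-1.2.4. */
  public class GradientBehaviour {
      enum Mode { EXPLORE, RETURN }
      private Mode mode = Mode.EXPLORE;

      public void step(FieldView field, boolean carryingSample, boolean atVehiclePlatform) {
          if (carryingSample) {                           // 1.2.2.3: holding a sample -> return mode
              mode = Mode.RETURN;
          }
          if (mode == Mode.EXPLORE                        // 1.2.2.1: no lower neighbouring cell -> return mode
                  && field.lowestNeighbourValue() >= field.valueHere()) {
              mode = Mode.RETURN;
          }
          if (mode == Mode.RETURN && atVehiclePlatform) { // 1.2.2.2: back at the platform -> explore again
              mode = Mode.EXPLORE;
          }
          // 1.2.3.1 / 1.2.4.1: move up the gradient when returning, down it when exploring
          double heading = (mode == Mode.RETURN)
                  ? field.directionOfHighestNeighbour()
                  : field.directionOfLowestNeighbour();
          field.moveTowards(heading);
      }

      /** Hypothetical view of the gradient field around the robot, plus a movement command. */
      public interface FieldView {
          double valueHere();
          double lowestNeighbourValue();
          double directionOfHighestNeighbour();
          double directionOfLowestNeighbour();
          void moveTowards(double heading);
      }
  }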

1.3. self-organization through path-attraction

1.3.1. if I carry a sample, I drop 2 crumbs

1.3.2. if I carry no sample and crumbs are detected, I pick up one crumb

1.3.3. if I carry no sample and crumbs are detected, move towards the highest concentration of crumbs.
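
A minimal Java sketch of the three crumb rules; the crumb sensing and dropping interface is a hypothetical placeholder.

  /** Minimal sketch of the path-attraction (crumb) rules in 1.3. */
  public class CrumbBehaviour {
      public void step(CrumbWorld world, boolean carryingSample) {
          if (carryingSample) {
              world.dropCrumbs(2);                              // 1.3.1: lay a trail while carrying
          } else if (world.crumbsDetected()) {
              world.pickUpCrumb();                              // 1.3.2: weaken the trail by one crumb
              world.moveTowards(world.directionOfMostCrumbs()); // 1.3.3: follow the densest crumbs
          }
      }

      /** Hypothetical interface for sensing and manipulating crumbs. */
      public interface CrumbWorld {
          boolean crumbsDetected();
          void dropCrumbs(int n);
          void pickUpCrumb();
          double directionOfMostCrumbs();
          void moveTowards(double heading);
      }
  }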

2. justifications

2.1. random walk theorem

Starting from any point, a random walk restricted to a finite space will reach any other point any number of times (Chung, 1974).
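
One standard way to formalize this (my reading of the intended statement, not Chung’s exact wording): for an irreducible random walk (X_n) on a finite state space S, i.e. one in which every cell can be reached from every other, every state is visited infinitely often with probability one:

  \Pr\bigl( X_n = j \ \text{for infinitely many } n \;\big|\; X_0 = i \bigr) = 1
  \qquad \text{for all } i, j \in S.

This is what justifies replacing explicit exploration planning with a random walk: given enough time, the walk covers the whole (finite) terrain.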

Bibliography

[1]
L. Steels, “Cooperation between distributed agents through self-organization,” Decentralized AI, pp. 175–196, 1990. Accessed: Oct. 31, 2023. [Online]. Available: https://cir.nii.ac.jp/crid/1570009749529465472

Backlinks

Luc Steels’ Mars Explorer has very low cognitive load because it replaces complex exploration planning with random exploration

developmental luc steels’ mars explorer

My BSc Computer Science Final Year Project, supervised by Terry Payne

An implementation of Luc Steels’ Mars Explorer in a robot simulator, with extensions for developmental learning support.

developmental luc steels’ mars explorer

(Design & Plan)

First, I’ll implement Luc Steels’ Mars Explorer as-is in Webots, which would involve the following (a rough sketch of the behaviour interface appears after the list):

  1. implementing the subsumption architecture in Java
  2. implementing a vehicle controller on top of the Webots API that exposes an abstract API for Luc Steels’ Mars Explorer
  3. implementing the individual behaviours of Luc Steels’ Mars Explorer using the abstract API from step 2 and the subsumption behaviour interface from step 1
  4. integrating the individual behaviours from step 3 into the subsumption architecture from step 1
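
As a rough sketch of steps 1 and 4 under these assumptions, the behaviour interface and its arbitration might look as follows in Java; the names (Behaviour, VehicleApi, SubsumptionController) are placeholders of my own, not part of the Webots API.

  import java.util.List;

  /** Rough sketch of a subsumption-style controller (step 1) and layer integration (step 4). */
  public class SubsumptionController {

      /** One behaviour layer of the subsumption architecture. */
      public interface Behaviour {
          boolean active(VehicleApi vehicle); // does this layer want control right now?
          void act(VehicleApi vehicle);       // issue actuator commands for one step
      }

      /** Abstract vehicle API, to be backed by the Webots controller from step 2. */
      public interface VehicleApi { /* sensor and actuator accessors go here */ }

      private final List<Behaviour> layers; // ordered from highest to lowest priority

      public SubsumptionController(List<Behaviour> layers) {
          this.layers = layers;
      }

      /** Each step, the highest-priority active layer suppresses the layers below it. */
      public void step(VehicleApi vehicle) {
          for (Behaviour layer : layers) {
              if (layer.active(vehicle)) {
                  layer.act(vehicle);
                  return;
              }
          }
      }
  }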

Then, I’ll implement a naive developmental extension within the existing setting (map, sensors, and motors), namely the constructivist knowledge base, the valence function/system, and the interaction observation system. After that comes a new “developmental explore” layer to replace the default “random walk” layer of Luc Steels’ Mars Explorer; it would query the constructivistically learnt knowledge base for the next action rather than picking one purely at random. Following that would be experimental metrics and trace analysis to see whether any interesting behaviour emerges.
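
As a very rough sketch of this plan (nothing here exists yet, and every name is a placeholder of my own), the “developmental explore” layer might query the knowledge base like this:

  /** Sketch of the planned "developmental explore" layer: it asks a constructivist
   *  knowledge base for the next action instead of acting purely at random, then feeds
   *  the observed interaction and its valence back into the knowledge base. */
  public class DevelopmentalExploreLayer {

      /** A primitive or composite action the vehicle can enact. */
      public interface Action { void execute(); }

      /** Constructivist knowledge base: proposes actions, learns from recorded interactions. */
      public interface KnowledgeBase {
          Action proposeAction(Object situation);
          void recordInteraction(Object situation, Action action, double valence);
      }

      /** Valence function/system: scores how an enacted interaction went. */
      public interface ValenceSystem {
          double valence(Object situation, Action action);
      }

      private final KnowledgeBase kb;
      private final ValenceSystem valenceSystem;

      public DevelopmentalExploreLayer(KnowledgeBase kb, ValenceSystem valenceSystem) {
          this.kb = kb;
          this.valenceSystem = valenceSystem;
      }

      /** One step: enact the proposed action, then record the interaction and its valence. */
      public void step(Object situation) {
          Action action = kb.proposeAction(situation);
          action.execute();
          kb.recordInteraction(situation, action, valenceSystem.valence(situation, action));
      }
  }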

One interesting idea is whether a shortcut to a signal would change the behaviour. For example, the gradient-field signal enables the agent to derive a “go up gradient” action from gradient information: with only the gradient signal and a move action, it is possible to develop a composite interaction equivalent to “go up gradient”; however, “go up gradient” can also be given as a primitive action. What would the differences be between having it as a primitive or not?

Another interesting idea is to learn with multimodality, for example via freezing. Multimodality could perhaps be achieved by freezing a part of the knowledge base into a subsumption behaviour layer and inserting it right above the “developmental explore” layer as a “developed behaviour”. I still need to figure out the triggers (do we keep learning these layers continuously, or do they freeze, perhaps gradually?) and reinitialization (do we delete the frozen part of the knowledge base? do we start anew with an empty knowledge base?). This is like learning multiple strategies or skills: when learning baseball, once I have learnt how to hit the ball with a bat, I leave that skill there; my knowledge of it does not change much, nor does it need to, since I can do it well enough, and I focus on catching instead.

This project implements Luc Steels’ Mars Explorer in Webots.

This is a case study of applying the subsumption architecture plus randomness to an autonomous mobile robot system.

This paper is a compilation of the following ideas:

Author: Linfeng He

Created: 2024-04-03 Wed 19:36