Hierarchical Behavior Learning
The topic of my thesis research is hierarchical behavior learning.
Clearly, humans solve complex problems using hierarchy, and many researchers
have worked on how to incorporate hierarchy into AI. This work is very
much in progress, so rather than present my current half-baked ramblings,
I'm going to list some insights I've had that may or may not
lead to something.
-
My interest in hierarchical behaviors started back when I was an undergraduate
at Stanford. It seemed to me that language might best be modelled
using a hierarchical structure, or something like an object-oriented
framework.
-
Originally, my ideas for my undergraduate thesis started out more in this
vein, but I was convinced to work on something a bit more tractable for
a 1-year project by some great advisors (Nils Nilsson, John Koza, and
Tom Wasow).
So, I worked on automatically acquiring mental maps for simple
robotics-style problems using genetic programming. My thesis, in which I presented
my mapmaker research and outlined my young and naive grand vision, is available
here.
-
While working in the field of genetic programming, I evolved programs that
contain subroutines, and did some work on evolving program
architecture using GP. This work is discussed in the book of
which I'm a co-author (go here
for more info).
-
Once in graduate school, I fairly quickly became convinced that making stronger
use of knowledge about programs and their effects was important for automatically
determining hierarchies and for doing long-term learning. Thus, I
turned to reinforcement learning. Since then, I've been on a fairly
long intellectual journey trying to pin down exactly what I mean by hierarchy,
how it applies to reinforcement learning, and how to build systems that
use it.
-
One of my first ideas was that we first need to figure out what makes a
good macro-action. Based on ruminating about this for a while, and
some discussions with Sebastian Thrun and Andrew McCallum at CMU, I developed
a really simple system that evaluates potential subgoals based on a heuristic
that values macro usefulness and generality. I presented a paper on this system
(available here)
at a NIPS workshop on hierarchical RL.
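To make the idea of scoring candidate subgoals concrete, here is a minimal sketch. It is not the heuristic from the workshop paper, whose details are not given here. It assumes, purely for illustration, that "usefulness" is the fraction of observed trajectories that pass through a state and "generality" is the fraction of distinct tasks in which the state appears. The function name `score_subgoals` and the toy states are hypothetical.

```python
from collections import defaultdict


def score_subgoals(trajectories_by_task):
    """Score each intermediate state as a candidate subgoal.

    trajectories_by_task maps a task name to a list of trajectories,
    where each trajectory is a list of hashable states.
    Assumed (illustrative) score: usefulness * generality.
    """
    total = sum(len(trajs) for trajs in trajectories_by_task.values())
    if total == 0:
        return {}

    visits = defaultdict(int)      # trajectories passing through the state
    tasks_seen = defaultdict(set)  # tasks in which the state appears

    for task, trajs in trajectories_by_task.items():
        for traj in trajs:
            # Skip the start and end states; count each state once per trajectory.
            for state in set(traj[1:-1]):
                visits[state] += 1
                tasks_seen[state].add(task)

    n_tasks = len(trajectories_by_task)
    return {
        s: (visits[s] / total) * (len(tasks_seen[s]) / n_tasks)
        for s in visits
    }


# Toy data: "door" lies on paths for both tasks, "hall" for only one.
toy = {
    "A": [["s0", "door", "g1"]],
    "B": [["s1", "door", "g2"], ["s1", "hall", "g2"]],
}
scores = score_subgoals(toy)
best = max(scores, key=scores.get)
```

With this toy data the state shared across tasks scores highest, which is the kind of behavior a usefulness-and-generality heuristic is meant to reward.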
-
Over the summer of 1999, I developed a "grand vision" of how one might
want to use hierarchy. Now, I'm working out the gory details and
trying to pick a piece to work on for a thesis.
-
One bit is to give each action a source-path-goal schema:
an abstract model of what the action does.
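One way to picture a source-path-goal schema is as a small record attached to each action. The sketch below is only an assumption about what such a record could look like: the class name `SourcePathGoal`, its fields, and the grid-world example are all hypothetical, and states are taken to be `(x, y)` tuples.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

State = Tuple[int, int]  # assumed state representation: grid coordinates


@dataclass(frozen=True)
class SourcePathGoal:
    """Hypothetical abstract model of an action."""
    name: str
    source: Callable[[State], bool]  # where the action may start
    path: Callable[[State], bool]    # constraint on intermediate states
    goal: Callable[[State], bool]    # where the action is expected to end

    def describes(self, trajectory: Sequence[State]) -> bool:
        """Return True if the trajectory fits this schema."""
        if len(trajectory) < 2:
            return False
        return (
            self.source(trajectory[0])
            and all(self.path(s) for s in trajectory[1:-1])
            and self.goal(trajectory[-1])
        )


# Illustrative schema: walk along the corridor y == 2 to a door at (5, 2).
go_to_door = SourcePathGoal(
    name="go_to_door",
    source=lambda s: s[0] < 5,
    path=lambda s: s[1] == 2,
    goal=lambda s: s == (5, 2),
)

fits = go_to_door.describes([(0, 2), (1, 2), (2, 2), (5, 2)])
strays = go_to_door.describes([(0, 2), (1, 3), (5, 2)])
```

Keeping the three parts as separate predicates means a planner could reason about where an action applies and where it ends without simulating the action itself.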
-
Another bit is to extend Ron Parr's HAMs (hierarchies of abstract machines)
to handle function approximation, parameters, and an object-oriented
knowledge base.
-
Some of this most recent work has been influenced by the FOPL group's work
on extending Avi Pfeffer's Object-Oriented
Belief Nets into a first-order probabilistic logic, where
we do inference using MCMC.

dandre at cs.berkeley.edu