
Bayesian Optimization is the classic example of meta-model-driven optimization: each new observation is used to refit a Gaussian process, whose posterior suggests where to evaluate next.
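For concreteness, here is a minimal 1-D sketch of that loop, assuming a toy objective, an RBF kernel, and an upper-confidence-bound acquisition (all illustrative choices, not a reference implementation):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def f(x):                      # toy objective to maximize (peak at 0.3)
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(4, 1))      # initial observations
y = f(X).ravel()
grid = np.linspace(0, 1, 201).reshape(-1, 1)

for _ in range(10):
    # refit the GP on everything observed so far
    gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    ucb = mu + 2.0 * sigma              # acquisition: mean + exploration bonus
    x_next = grid[np.argmax(ucb)]       # evaluate where the GP looks promising
    X = np.vstack([X, [x_next]])
    y = np.append(y, f(x_next[0]))

best = X[np.argmax(y), 0]
print(round(best, 2))
```

The GP supplies both a mean prediction and an uncertainty estimate, and the acquisition trades them off to decide where to sample next.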

LEMs (Learnable Evolution Models) are evolutionary algorithms in which, rather than recombining observations (as GAs do), the population is fed to a classifier that identifies which regions of the search space are most promising (although the method itself layers quite a few non-statistical operations on top of this).

I am looking for something simpler, where the optimization is driven by a plain regression tree: sample from the most promising leaf, or run a bandit algorithm over the leaves. However, I can't find any reference on the subject, and it must have been tried before.
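A rough sketch of what I have in mind, in a batch setting (the objective, the UCB-style leaf score, and the "sample uniformly inside the best leaf's box" rule are all illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def f(x):                                  # toy 1-D objective to maximize
    return -(x - 0.7) ** 2

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(8, 1))
y = f(X).ravel()

for _ in range(20):
    tree = DecisionTreeRegressor(min_samples_leaf=3).fit(X, y)
    leaves = tree.apply(X)                 # leaf id of every observation
    best_leaf, best_score = None, -np.inf
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        # UCB-style score: leaf mean plus an exploration bonus
        score = y[mask].mean() + np.sqrt(2 * np.log(len(y)) / mask.sum())
        if score > best_score:
            best_leaf, best_score = leaf, score
    # sample the next point inside the winning leaf's observed range
    lo, hi = X[leaves == best_leaf].min(), X[leaves == best_leaf].max()
    x_next = rng.uniform(lo, hi)
    X = np.vstack([X, [[x_next]]])
    y = np.append(y, f(x_next))

print(round(X[np.argmax(y), 0], 2))
```

Each leaf is treated as a bandit arm, and the tree is simply refit after every evaluation; an incremental tree learner would avoid the refitting.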

CarrKnight

1 Answer


An active-learning approach that combines an incrementally learned regression tree with bandit-style sampling from leaf nodes, to decide which instance to request a label for next, is described in the paper “Adapting to Concept Drift in Credit Card Transaction Data Streams Using Contextual Bandits and Decision Trees” (disclaimer: I'm an author on the paper).

A short summary:

  • A regression tree is learned to predict targets $\hat{y}$ from feature vectors $x$. An incremental tree learner is used, so the tree can grow over time as labelled data becomes available (for instance through active learning).
  • Every node in the regression tree also contains a linear model, trained so that we can additionally distinguish between instances that fall into the same leaf node. Training linear models in all nodes, rather than only the leaf nodes or only the root, lets us both generalize across the tree and specialize for subsets of it.
  • Predictions made by the linear models can be treated as "ground truth" labels for unlabelled instances and fed back into the regression tree learner, allowing the tree to grow more quickly than it could with real labels alone (semi-supervised learning).
  • Leaf nodes are treated as arms of a contextual multi-armed bandit (contextual because we also have feature vectors for the instances drawn from leaf nodes).
  • A contextual multi-armed bandit algorithm is then used to select the unlabelled instances for which we would like to obtain labels (i.e. active learning).
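The steps above can be sketched very roughly as follows. This is a deliberately simplified stand-in: a batch `DecisionTreeRegressor` replaces the incremental tree learner with per-node linear models, and a plain "query the least-labelled leaf" rule replaces the paper's contextual bandit; the pool, the hidden labelling function, and all names are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
true_f = lambda X: np.sin(3 * X[:, 0]) + X[:, 1]   # hidden labelling function

pool = rng.uniform(0, 1, size=(200, 2))            # unlabelled instance pool
labelled_idx = list(rng.choice(200, size=10, replace=False))

for _ in range(30):                                 # active-learning loop
    X_lab = pool[labelled_idx]
    y_lab = true_f(X_lab)                           # labels obtained so far
    tree = DecisionTreeRegressor(min_samples_leaf=4).fit(X_lab, y_lab)

    # Treat each leaf as a bandit arm; count labels held by each leaf.
    leaf_of_pool = tree.apply(pool)
    leaf_of_lab = tree.apply(X_lab)
    counts = {leaf: int((leaf_of_lab == leaf).sum())
              for leaf in np.unique(leaf_of_pool)}

    # Query the unlabelled instance whose leaf holds the fewest labels.
    candidates = [i for i in range(len(pool)) if i not in labelled_idx]
    chosen = min(candidates, key=lambda i: counts.get(leaf_of_pool[i], 0))
    labelled_idx.append(chosen)                     # "request" its label

print(len(labelled_idx))
```

The refitting and the crude leaf-count rule are where the paper's incremental learner and contextual bandit would slot in.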
Dennis Soemers