How to report Terminal Node (only the one used to predict Y) Standard Deviation in TreeBagger?

Hello!
Is it possible to report the average standard deviation of the terminal nodes in TreeBagger? TreeBagger reports the average of the individual tree responses, where each tree's response is the average Y of the elements in its terminal node (the individual tree's predicted Y). I would also like to get the average, over the trees, of the standard deviation in each tree's terminal node. How can I accomplish that?
The function [Y,stdevs]=predict(TreeBagger object, [Test data]) gives me the standard deviation of the individual trees' predictions, but I need the terminal node standard deviation. Can someone help?
P.S: I'm an anthropologist, not a computer scientist or engineer.
Best,
David

Accepted Answer

Ilya on 4 Oct 2012
TreeBagger is an ensemble of trees. All trees are different because they were grown on different bootstrap replicas of the data. For example, one tree can have say 50 nodes and another one can have 60 nodes. Even if both trees have the same number of nodes, the tree structures could be very different. So what is "terminal nodes standard deviation average in TreeBagger"? You could, I suppose, take all leaves (terminal nodes) from all trees in the ensemble and compute the standard deviation over them. I have never seen anyone do this. What statistical interpretation would this number have?
If you want to obtain the standard deviation for a node in a single regression tree, you could use the nodeerr method of classregtree. For example,
load carsmall
X = [Acceleration Displacement Horsepower Weight];
b = TreeBagger(10,X,MPG,'method','reg')
nodeerr(b.Trees{1})
would give you the node variance (standard deviation squared) for every node of the 1st tree in the ensemble; take the square root to get the standard deviation. This number has a clear interpretation: it is the uncertainty of prediction for an observation landing on this node of the tree.
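To look only at the terminal nodes of that one tree, something like the following might work. It is a sketch that assumes b.Trees{1} is a classregtree object, as in the example above; isbranch is the classregtree method that flags non-terminal nodes, and the variable names (nodeVar, isLeaf, leafSD) are only illustrative.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
rng(1);                                  % make the bootstrap replicas reproducible
b = TreeBagger(10, X, MPG, 'method', 'regression');

t = b.Trees{1};                          % assumed to be a classregtree object
nodeVar = nodeerr(t);                    % variance of the training responses in every node
isLeaf = ~isbranch(t);                   % true for terminal nodes, false for branch nodes
leafSD = sqrt(nodeVar(isLeaf));          % one standard deviation per terminal node
```

Note that these values describe the training observations that fell into each leaf of this particular tree, not the ensemble as a whole.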
  2 Comments
David Navega on 4 Oct 2012
I just want the standard deviation of the terminal node used to predict Y for a given X, not of all the nodes. Is that possible in TreeBagger?
Ilya on 4 Oct 2012
Edited: Ilya on 4 Oct 2012
It is possible for a single tree. TreeBagger predicts by averaging predictions from individual trees. There is one node in every tree on which a specific x lands. So if a TreeBagger has 100 trees, 100 nodes will be used to predict y for this x. You can take the standard deviation over these 100 nodes, but again, this would be a number without a meaningful statistical interpretation.
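For what it is worth, the per-tree lookup described here could be sketched roughly as follows. This assumes every element of b.Trees is a classregtree object (as in this thread), so that eval can return the node each row lands on and nodeerr the node variances; in releases where b.Trees holds a different tree class, these calls would need that class's equivalents. Xnew, leafSD and avgLeafSD are illustrative names.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
rng(1);                                   % make the bootstrap replicas reproducible
b = TreeBagger(100, X, MPG, 'method', 'regression');

Xnew = X(1:5, :);                         % rows to predict (illustrative choice)
nTrees = numel(b.Trees);
leafSD = zeros(size(Xnew, 1), nTrees);    % one row per observation, one column per tree
for k = 1:nTrees
    t = b.Trees{k};                       % assumed to be a classregtree object
    [~, node] = eval(t, Xnew);            % node of tree k on which each row lands
    nodeVar = nodeerr(t);                 % variance of the training responses in every node
    leafSD(:, k) = sqrt(nodeVar(node));   % standard deviation of the landing node
end
avgLeafSD = mean(leafSD, 2);              % average over trees, as asked in the question
```

Each column of leafSD has the single-tree interpretation given in the answer. The row average avgLeafSD is the quantity the question asks for, with the caveat that combining nodes across trees does not have an equally clear statistical meaning.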
