
GBRT quantile loss terminal region update and possible feature request #4599

Closed
jwkvam opened this issue Apr 15, 2015 · 4 comments
jwkvam commented Apr 15, 2015

I have been playing around with the quantile loss implementation in GBRT, with somewhat mixed results. The plot below shows the differences between the predicted .05 and .95 quantiles and between the predicted .2 and .8 quantiles. The jagged curves use the default algorithm. The smooth curves replace the _update_terminal_region() / _update_terminal_regions() functions with the ones from the LeastSquaresError class. I'm getting better results with the modified version: ~90% of my predictions fall within the .05–.95 quantile range and ~60% fall within the .2–.8 range. With the default terminal-region updates on my dataset, both coverage figures are about 5–10% lower.

I'm wondering if it makes sense to make this an option, i.e. a boolean to just use the MSE estimates of the leaves in the tree to update the prediction values, for cases where the more sophisticated leaf updates don't happen to work as well as the default ones.
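To make the comparison concrete, here is a minimal NumPy sketch (not scikit-learn's actual internals) of the two leaf-value rules being contrasted: the least-squares update sets a leaf's value to the mean of the residuals in that leaf, while the quantile-loss update sets it to their alpha-quantile.

```python
import numpy as np

def leaf_value_least_squares(residuals):
    # Least-squares leaf update: the mean minimizes squared error.
    return np.mean(residuals)

def leaf_value_quantile(residuals, alpha):
    # Quantile-loss leaf update: the alpha-quantile minimizes the
    # pinball (quantile) loss over the leaf.
    return np.percentile(residuals, alpha * 100)

rng = np.random.RandomState(0)
residuals = rng.normal(size=1000)
print(leaf_value_least_squares(residuals))   # mean of centered noise, near 0
print(leaf_value_quantile(residuals, 0.95))  # near the N(0,1) .95 quantile
```

For symmetric residuals the two rules agree only at alpha = 0.5; for the outer quantiles they diverge, which is where the choice of update rule matters most.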

I realize this probably isn't a very strong argument yet, but I wanted to see how others felt.

[Plot: predicted quantile differences (.95 − .05 and .2 − .8) for the existing algorithm vs. the modified one]

This may be related to #4210?

cc @pprett @glouppe

ogrisel commented Apr 16, 2015

I am not sure I understand. Could you please put the diff of your changes in a gist.github.com or push them as a branch in your repo and make it more explicit which is which in the labels of your plot for each of the 4 curves (cyan, red, blue and green)?

jwkvam commented Apr 16, 2015

Sorry for not being clearer: I don't have any code that implements the feature I'm asking for. As a temporary workaround I made a new loss function [1] that has the desired effect.

I'll try to explain the plot better. I had some private data from an energy-use forecasting problem. I plotted the differences between the predicted .95 and .05 quantiles and between the predicted .8 and .2 quantiles, for both the existing loss function and [1].

  • Red: uses the existing algorithm for the .95 - .05 difference
  • Blue: uses [1] for the .95 - .05 difference
  • Cyan: uses the existing algorithm for the .8 - .2 difference
  • Green: uses [1] for the .8 - .2 difference

With the existing code I was having problems with the predicted quantiles not being ordered properly (e.g. the predicted 20th percentile exceeding the predicted median) and with fewer samples than expected falling inside the quantile ranges.
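Both failure modes are easy to check numerically. A small sketch with stand-in arrays (in practice y_true, q20, and q80 would come from two GradientBoostingRegressor models fit with loss='quantile' and alpha=0.2 / 0.8; the constant predictions here are purely illustrative):

```python
import numpy as np

rng = np.random.RandomState(42)
y_true = rng.normal(size=500)
q20 = np.full(500, -0.9)   # stand-in for the predicted .2 quantile
q80 = np.full(500, 0.8)    # stand-in for the predicted .8 quantile

# Quantile crossing: fraction of points where the lower quantile
# exceeds the upper one (should be exactly 0).
crossing = np.mean(q20 > q80)

# Empirical coverage of the (.2, .8) interval (should be near 0.6).
coverage = np.mean((y_true >= q20) & (y_true <= q80))
print(crossing, coverage)
```

A well-calibrated pair of quantile models should give zero crossings and coverage close to the nominal 60%; deviations like the 5–10% shortfall described above show up directly in these two numbers.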

I hope that helps, thanks for your time.

[1] https://gist.github.com/jwkvam/f5e9b2697479de55f1c1

pprett commented Apr 27, 2015

Thanks for looking into this @jwkvam. It seems that the quantile loss needs some closer attention (more than I can dedicate currently).

lorentzenchr (Member) commented:
The loss in [1] seems to be the same as implemented in #924, but update_terminal_regions seems different, i.e. [1] sets terminal nodes as

y_pred[:, k] += learning_rate * tree.predict(X).ravel()

In ESL, 2nd ed., Algorithm 10.3 (p. 361) instructs to find the terminal regions of each tree by a squared-error criterion on the gradients, but then to set the value of each terminal node to the minimizer of the loss in that region, which in our case is the quantile (and not the tree's prediction of the gradient).
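That two-step update can be sketched as follows. This is a hypothetical, standalone illustration (the function name quantile_boost_step and all parameters are made up, and it is not how scikit-learn structures its internals): a DecisionTreeRegressor is fit to the pinball-loss gradients with its default squared-error criterion, and each leaf's contribution is then overwritten with the per-leaf alpha-quantile of the residuals.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def quantile_boost_step(X, y, y_pred, alpha, learning_rate=0.1, max_depth=3):
    residuals = y - y_pred
    # Negative gradient of the pinball loss: alpha where the residual
    # is positive, alpha - 1 where it is negative.
    gradient = np.where(residuals > 0, alpha, alpha - 1.0)
    # Step 1: find terminal regions with a squared-error tree on the gradients.
    tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, gradient)
    leaves = tree.apply(X)
    # Step 2: set each leaf's value to the loss minimizer in that region,
    # which for the pinball loss is the alpha-quantile of the residuals
    # (not the tree's own prediction of the gradient).
    update = np.empty_like(y_pred)
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        update[mask] = np.percentile(residuals[mask], alpha * 100)
    return y_pred + learning_rate * update

# Toy run: boosting a constant prediction toward the .9 quantile of noise.
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 1))
y = rng.normal(size=200)
pred = np.full(200, np.median(y))
for _ in range(50):
    pred = quantile_boost_step(X, y, pred, alpha=0.9)
```

After enough iterations the fraction of targets below the prediction should approach the nominal alpha, which is exactly the calibration property the default updates are meant to provide.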
