
GBRT quantile loss terminal region update and possible feature request #4599

Closed
jwkvam opened this issue Apr 15, 2015 · 4 comments
jwkvam commented Apr 15, 2015

I have been playing around with the quantile loss implementation in GBRT, with somewhat mixed results. The plot below shows the differences between the predicted .05 and .95 quantiles and between the predicted .2 and .8 quantiles. The jagged curves use the default algorithm. The smooth curves replace the _update_terminal_region() / _update_terminal_regions() functions with the ones from the LeastSquaresError class. I'm getting better results with the modified version: ~90% of my predictions fall within the .05–.95 quantile range and ~60% fall within the .2–.8 range. With the default terminal-region updates on my dataset, both coverage figures are about 5–10% lower.

I'm wondering if it makes sense to make this an option, i.e. a boolean to just use the MSE estimates of the leaves in the tree to update the prediction values, for cases where the more sophisticated leaf updates don't happen to work as well as the default ones.
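To make the comparison concrete, here is a minimal NumPy sketch (not scikit-learn's actual internals) of the two leaf-value rules being contrasted: the least-squares update sets a leaf's value to the mean of the residuals in that leaf, while the quantile-loss update sets it to their alpha-quantile.

```python
import numpy as np

def leaf_value_least_squares(residuals):
    # Least-squares leaf update: the mean minimizes squared error.
    return np.mean(residuals)

def leaf_value_quantile(residuals, alpha):
    # Quantile-loss leaf update: the alpha-quantile minimizes the
    # pinball (quantile) loss over the leaf.
    return np.percentile(residuals, alpha * 100)

rng = np.random.RandomState(0)
residuals = rng.normal(size=1000)
print(leaf_value_least_squares(residuals))   # mean of centered noise, near 0
print(leaf_value_quantile(residuals, 0.95))  # near the N(0,1) .95 quantile
```

For symmetric residuals the two rules agree only at alpha = 0.5; for the outer quantiles they diverge, which is where the choice of update rule matters most.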

I realize this probably isn't a very strong argument yet, but I wanted to see how others felt.

[Plot: predicted quantile differences (.95 − .05 and .2 − .8) for the existing algorithm vs. the modified one]

This may be related to #4210?

cc @pprett @glouppe

ogrisel commented Apr 16, 2015

I am not sure I understand. Could you please put the diff of your changes in a gist.github.com or push them as a branch in your repo and make it more explicit which is which in the labels of your plot for each of the 4 curves (cyan, red, blue and green)?

jwkvam commented Apr 16, 2015

Sorry for not being clearer: I don't have any code that implements the feature I'm asking for. As a temporary workaround I made a new loss function [1] that has the desired effect.

I'll try to explain the plot better. I had some private data from an energy-use forecasting problem. I plotted the differences between the predicted .95 and .05 quantiles and between the predicted .8 and .2 quantiles, for both the existing loss function and [1].

  • Red: uses the existing algorithm for the .95 - .05 difference
  • Blue: uses [1] for the .95 - .05 difference
  • Cyan: uses the existing algorithm for the .8 - .2 difference
  • Green: uses [1] for the .8 - .2 difference

With the existing code I was having problems with the predicted quantiles not being ordered properly (e.g. the predicted 20th percentile exceeding the predicted median) and with fewer samples than expected falling inside the quantile ranges.
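Both failure modes are easy to check numerically. A small sketch with stand-in arrays (in practice y_true, q20, and q80 would come from two GradientBoostingRegressor models fit with loss='quantile' and alpha=0.2 / 0.8; the constant predictions here are purely illustrative):

```python
import numpy as np

rng = np.random.RandomState(42)
y_true = rng.normal(size=500)
q20 = np.full(500, -0.9)   # stand-in for the predicted .2 quantile
q80 = np.full(500, 0.8)    # stand-in for the predicted .8 quantile

# Quantile crossing: fraction of points where the lower quantile
# exceeds the upper one (should be exactly 0).
crossing = np.mean(q20 > q80)

# Empirical coverage of the (.2, .8) interval (should be near 0.6).
coverage = np.mean((y_true >= q20) & (y_true <= q80))
print(crossing, coverage)
```

A well-calibrated pair of quantile models should give zero crossings and coverage close to the nominal 60%; deviations like the 5–10% shortfall described above show up directly in these two numbers.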

I hope that helps, thanks for your time.

[1] https://gist.github.com/jwkvam/f5e9b2697479de55f1c1

pprett commented Apr 27, 2015

Thanks for looking into this @jwkvam. It seems that the quantile loss needs some closer attention (more than I can dedicate currently).

lorentzenchr (Member) commented:
The loss in [1] seems to be the same as implemented in #924, but update_terminal_regions seems different, i.e. [1] sets terminal nodes as

y_pred[:, k] += learning_rate * tree.predict(X).ravel()

In ESL, 2nd ed., Algorithm 10.3 (p. 361) instructs to find the terminal regions of each tree by a squared-error criterion on the gradients, but then to set the value of each terminal node to the minimizer of the loss in that region, which in our case is the quantile (and not the tree's prediction of the gradient).
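That two-step update can be sketched as follows. This is a hypothetical, standalone illustration (the function name quantile_boost_step and all parameters are made up, and it is not how scikit-learn structures its internals): a DecisionTreeRegressor is fit to the pinball-loss gradients with its default squared-error criterion, and each leaf's contribution is then overwritten with the per-leaf alpha-quantile of the residuals.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def quantile_boost_step(X, y, y_pred, alpha, learning_rate=0.1, max_depth=3):
    residuals = y - y_pred
    # Negative gradient of the pinball loss: alpha where the residual
    # is positive, alpha - 1 where it is negative.
    gradient = np.where(residuals > 0, alpha, alpha - 1.0)
    # Step 1: find terminal regions with a squared-error tree on the gradients.
    tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, gradient)
    leaves = tree.apply(X)
    # Step 2: set each leaf's value to the loss minimizer in that region,
    # which for the pinball loss is the alpha-quantile of the residuals
    # (not the tree's own prediction of the gradient).
    update = np.empty_like(y_pred)
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        update[mask] = np.percentile(residuals[mask], alpha * 100)
    return y_pred + learning_rate * update

# Toy run: boosting a constant prediction toward the .9 quantile of noise.
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 1))
y = rng.normal(size=200)
pred = np.full(200, np.median(y))
for _ in range(50):
    pred = quantile_boost_step(X, y, pred, alpha=0.9)
```

After enough iterations the fraction of targets below the prediction should approach the nominal alpha, which is exactly the calibration property the default updates are meant to provide.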
