Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT Checking function _estimator_has also raises AttributeError #28167

Merged
merged 10 commits into from Feb 13, 2024

Conversation

StefanieSenger
Copy link
Contributor

Reference Issues/PRs

Towards #28108

What does this implement/fix? Explain your changes.

This PR aims to display a more understandable error message in the case when sub-estimators don't implement a method, the meta-estimator that they are being used in DOES implement. See the issue for an example.

  • pushes the sub-estimator's AttributeError to be raised during available_if, to prevent the too generic error message from _AvailableIfDescriptor from being raised
  • I have checked for code aimed to raise the error locally in the corresponding meta-estimator, that is not used anymore: except for in OneVsRestClassifier, nothing was to be found

Copy link

github-actions bot commented Jan 18, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2d24051. Link to the linter CI: here

@glemaitre
Copy link
Member

The failure is not associated with this PR. I open #28168 trying to solve the issue.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of source code, it looks good. We need to add non-regression case for each case or if the test already exist, we need to match the class name to be sure that we have the proper error message.

I gave an example for the stacking case.

sklearn/ensemble/_stacking.py Outdated Show resolved Hide resolved
sklearn/ensemble/_stacking.py Outdated Show resolved Hide resolved
sklearn/ensemble/_stacking.py Outdated Show resolved Hide resolved
sklearn/ensemble/_stacking.py Show resolved Hide resolved
sklearn/feature_selection/_from_model.py Outdated Show resolved Hide resolved
sklearn/feature_selection/_rfe.py Show resolved Hide resolved
sklearn/semi_supervised/_self_training.py Outdated Show resolved Hide resolved
sklearn/semi_supervised/_self_training.py Show resolved Hide resolved
sklearn/multiclass.py Outdated Show resolved Hide resolved
sklearn/multiclass.py Show resolved Hide resolved
@glemaitre
Copy link
Member

I just realized that I commented in the issue and not the PR. @StefanieSenger you can have a look at the following comment regarding the unit tests: #28108 (comment)

@StefanieSenger
Copy link
Contributor Author

I've updated the tests as we have talked about, @glemaitre. They all pass now.

But there are new issues popping up in sklearn.inspection around partial_dependence, that are also unrelated, I believe.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to have an entry in the changelog since we fix the error message.

@glemaitre
Copy link
Member

Uhm I don't see the error with the partial_dependence. Where did you see the traceback @StefanieSenger?

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM. Thanks @StefanieSenger

sklearn/ensemble/tests/test_stacking.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_stacking.py Outdated Show resolved Hide resolved
sklearn/semi_supervised/tests/test_self_training.py Outdated Show resolved Hide resolved
StefanieSenger and others added 2 commits February 5, 2024 18:36
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@StefanieSenger
Copy link
Contributor Author

StefanieSenger commented Feb 5, 2024

I've made those little requested changes, @glemaitre. Thanks for reviewing and suggesting.

This is it for this PR.

On the unrelated test failure: Locally, when I run pytest sklearn/inspection I get 84 times: ValueError: Buffer dtype mismatch, expected 'int' but got 'long'. It all traces back to sklearn/tree/_tree.pyx and other .pyx files.

I also get this on my main branch, but not a branch where I had last pulled 4 weeks ago . My compilers are up to date with make in, this is still a valid way to re-build, correct?

I think it's connected to #27546 and somehow the re-build didn't work out for me.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits, otherwise LGTM.

sklearn/ensemble/_stacking.py Outdated Show resolved Hide resolved
Comment on lines 872 to 875
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, _ = train_test_split(
scale(X), y, stratify=y, random_state=42
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably use a smaller dataset, and avoid calling scale to make the test faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scale would avoid a ConvergenceWarning of the LogisticRegression certainly. To be checked if we remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can do make_classification instead then. Would be faster. The data doesn't matter here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now put make_classifiction for data creation.

@glemaitre
Copy link
Member

I also get this on my main branch, but not a branch where I had last pulled 4 weeks ago . My compilers are up to date with make in, this is still a valid way to re-build, correct?

@StefanieSenger Indeed, the error look like the type in the Cython file changed. So a good make clean && make in could make the trick (you also try to build with meson (https://scikit-learn.org/dev/developers/advanced_installation.html#building-with-meson) It will then detect automatically if something change and make the rebuild for you.

StefanieSenger and others added 2 commits February 12, 2024 11:58
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
@StefanieSenger
Copy link
Contributor Author

StefanieSenger commented Feb 12, 2024

@glemaitre
Thank you for the guidance with the build. My build hadn't worked properly (and I didn't realise that). Now it's resolved. :)
I will try meson on another occasion.

@adrinjalali adrinjalali merged commit 9a6e6dd into scikit-learn:main Feb 13, 2024
30 checks passed
@StefanieSenger StefanieSenger deleted the available_if branch February 13, 2024 14:14
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Feb 13, 2024
…cikit-learn#28167)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
glemaitre added a commit that referenced this pull request Feb 13, 2024
…28167)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants