Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using ndarray as init for KMeans raises a ValueError #26657

Merged
merged 18 commits into from
Jun 26, 2023
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
5 changes: 2 additions & 3 deletions sklearn/cluster/_kmeans.py
Original file line number Diff line number Diff line change
Expand Up @@ -897,10 +897,9 @@ def _check_params_vs_input(self, X, default_n_init=None):
)
self._n_init = default_n_init
if self._n_init == "auto":
if self.init == "k-means++":
self._n_init = default_n_init
if (not _is_arraylike_not_scalar(self.init)) and self.init == "k-means++":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should as well deal with the callable. So let's do the following:

        if self._n_init == "auto":
            if isinstance(self.init, str):
                self._n_init = 1 if self.init == "k-means++" else default_n_init
            elif callable(self.init):
                self._n_init = default_n_init
            else:  # array-like
                self._n_init = 1

Copy link
Contributor Author

@bnsh bnsh Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @glemaitre ... OK, so I added the changes you mention.. (Tho I changed it a little bit from what you wrote.. I more explicitly demand that self.init be either "k-means++" or "random" if it's a string, and either a callable or array like otherwise, and it raises an Exception otherwise.) I also added the docstring changes and the test case, but I don't know what I'm expected to do with the doc/whats_new/v1.3.rst... Can you please tell me what exactly I should be doing there? (That seems to be the only thing that's not passing CI/CD checks...)

Thanks!

Copy link
Contributor Author

@bnsh bnsh Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think I kind of figured out the Changelog thing. Or at least it's passing the CI/CD check now. I still don't know tho, if the text in there is any good..

self._n_init = 1
else:
self._n_init = default_n_init

if _is_arraylike_not_scalar(self.init) and self._n_init != 1:
warnings.warn(
Expand Down