FIX make sure that KernelPCA works with pandas output and arpack solver #27583

Merged
merged 2 commits into from Oct 16, 2023
5 changes: 5 additions & 0 deletions doc/whats_new/v1.4.rst
@@ -253,6 +253,11 @@ Changelog
   :pr:`26315` and :pr:`27098` by :user:`Mateusz Sokół <mtsokol>`,
   :user:`Olivier Grisel <ogrisel>` and :user:`Edoardo Abati <EdAbati>`.

+- |Fix| Fixes a bug in :class:`decomposition.KernelPCA` by forcing the output of
+  the internal :class:`preprocessing.KernelCenterer` to be a default array. When the
+  arpack solver was used, it would expect an array with a `dtype` attribute.
+  :pr:`27583` by :user:`Guillaume Lemaitre <glemaitre>`.
+
 :mod:`sklearn.ensemble`
 .......................

2 changes: 1 addition & 1 deletion sklearn/decomposition/_kernel_pca.py
@@ -432,7 +432,7 @@ def fit(self, X, y=None):
             raise ValueError("Cannot fit_inverse_transform with a precomputed kernel.")
         X = self._validate_data(X, accept_sparse="csr", copy=self.copy_X)
         self.gamma_ = 1 / X.shape[1] if self.gamma is None else self.gamma
-        self._centerer = KernelCenterer()
+        self._centerer = KernelCenterer().set_output(transform="default")
         K = self._get_kernel(X)
         self._fit_transform(K)

15 changes: 14 additions & 1 deletion sklearn/decomposition/tests/test_kernel_pca.py
@@ -3,7 +3,8 @@
 import numpy as np
 import pytest

-from sklearn.datasets import make_blobs, make_circles
+import sklearn
+from sklearn.datasets import load_iris, make_blobs, make_circles
 from sklearn.decomposition import PCA, KernelPCA
 from sklearn.exceptions import NotFittedError
 from sklearn.linear_model import Perceptron
@@ -551,3 +552,15 @@ def test_kernel_pca_inverse_correct_gamma():
     X2_recon = kpca2.inverse_transform(kpca1.transform(X))

     assert_allclose(X1_recon, X2_recon)
+
+
+def test_kernel_pca_pandas_output():
+    """Check that KernelPCA works with pandas output when the solver is arpack.
+
+    Non-regression test for:
+    https://github.com/scikit-learn/scikit-learn/issues/27579
+    """
+    pytest.importorskip("pandas")
+    X, _ = load_iris(as_frame=True, return_X_y=True)
+    with sklearn.config_context(transform_output="pandas"):
+        KernelPCA(n_components=2, eigen_solver="arpack").fit_transform(X)