IsotonicRegression results differ between fit/transform and fit_transform with ties in X #4184
Comments
Ah extremely sorry that I broke that :/ note to self: need NRTs after fix. |
thanks for the report! |
As a workaround, I tested using
vs.
|
At d255866
|
I get the same failure in 0.15.2 |
(that is using your regression tests, should have posted in the PR I guess) |
Hm, I had only tested 0.15.2 with the simple duplicate zero minimum value case. It goes deeper! |
Just updated PR #4185 to include comparison to R's |
Possibly related, also in sklearn 0.15.2 Given dataset X
Provides:
Although not huge differences, the two approaches to fit>transform are no longer equivalent |
@FixLaurens not really related to the issue here. I think this outcome is fine, as the results are quite close. You can't entirely guarantee identical results from numeric optimization. We could try to get closer in NMF, but I'm not sure this is worth the extra computation. |
…ion re: issue scikit-learn#4184 Expanding tests to include ties at both x_min and x_max Updating unit test to include reference data against R's isotone gpava() with ties=primary Adding R and isotone package versions for reproducibility/documentation Removing double space in docstring Combining tests for fit and transform with ties; fixing spelling error
* tag '0.16b1': (1589 commits) 0.16.X branching, version 0.16b1 Fix scikit-learn#4351. Rendering of docs in MinMaxScaler. Fix rebase conflict MAINT use canonical PEP-440 dev version consistently Adding fix for issue scikit-learn#4297, isotonic infinite loop DOC deprecate random_state for DBSCAN FIX/TST boundary cases in dbscan (closes scikit-learn#4073) Do not shuffle in DBSCAN (warn if `random_state` is used). Update docstring predict_proba() Update documentation of predict_proba in tree module add scipy2013 tutorial links to presentations on website. TST boundary handling in LSHForest.radius_neighbors ENH improve docstrings and test for radius_neighbors models use a pipeline for pre-processing feature selection, as per best practise DOC remove unnecessary backticks in CONTRIBUTING. ENH no need for tie breaking jitter in calibration Implement "secondary" tie strategy in isotonic. Adding unit test to cover ties/duplicate x values in Isotonic Regression re: issue scikit-learn#4184 MAINT fix typo pyagm -> pygamg in SkipTest STYLE trailing spaces ...
Per conversation in issue #2507, IsotonicRegression appears to have regressed due to commit a9ea55f.
This IPython notebook demonstrates the failure on HEAD.
I tested the following two commits with the notebook:
d255866: no difference, SUCCESS
a9ea55f: difference, FAILURE
In other words, I think we can blame the switch for `interp1d` from "linear" to "slinear"; first thought is that 1-d spline "slinear" matrix formulation is ill-posed for x-ties, whereas the piecewise "linear" implementation is unaffected?
Small additional note: confirmed failure with test case where x-values are all non-zero, e.g., `[1, 1, 2, 3, 4]` instead of `[0, 0, 1, 2, 3]`, so `x=0` isn't part of the cause.
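To make the tie problem concrete, here is a minimal pure-Python sketch of one way to keep the two code paths consistent: collapse tied x-values into a single weighted point before fitting, so the interpolation grid is strictly increasing. This is an illustration only, not scikit-learn's implementation; the helper names (`merge_ties`, `pava`, `interpolate`, `fit`, `transform`, `fit_transform`) and the y-values are hypothetical.

```python
from bisect import bisect_right
from itertools import groupby


def merge_ties(x, y):
    """Collapse duplicate x values into one point: mean y, weight = tie count."""
    pairs = sorted(zip(x, y))
    xs, ys, ws = [], [], []
    for xv, group in groupby(pairs, key=lambda p: p[0]):
        vals = [p[1] for p in group]
        xs.append(xv)
        ys.append(sum(vals) / len(vals))
        ws.append(len(vals))
    return xs, ys, ws


def pava(ys, ws):
    """Weighted pool-adjacent-violators for a non-decreasing fit."""
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for yv, wv in zip(ys, ws):
        blocks.append([yv, wv, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            w = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w, w, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def interpolate(xs, fs, t):
    """Piecewise-linear interpolation on strictly increasing xs, clipped at ends."""
    if t <= xs[0]:
        return fs[0]
    if t >= xs[-1]:
        return fs[-1]
    i = bisect_right(xs, t)
    x0, x1 = xs[i - 1], xs[i]
    # x1 > x0 is guaranteed because ties were merged, so no division by zero.
    return fs[i - 1] + (fs[i] - fs[i - 1]) * (t - x0) / (x1 - x0)


def fit(x, y):
    xs, ys, ws = merge_ties(x, y)
    return xs, pava(ys, ws)


def transform(model, t):
    xs, fs = model
    return [interpolate(xs, fs, v) for v in t]


def fit_transform(x, y):
    # Deliberately routed through the same fit + transform path.
    return transform(fit(x, y), x)


# Hypothetical data using the tied x-values from the report.
x_zero = [0, 0, 1, 2, 3]
x_shifted = [1, 1, 2, 3, 4]
y = [1.0, 3.0, 2.0, 4.0, 5.0]

result_zero = fit_transform(x_zero, y)
result_shifted = fit_transform(x_shifted, y)
```

Because `fit_transform` here is defined as `fit` followed by `transform`, the two paths cannot diverge, and both tied inputs above produce the same fitted values regardless of whether the tie sits at `x=0`. Whether averaging tied targets is the right tie strategy for a given application is a separate design choice.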