sklearn

Version 1.3.0

In Development

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

  • multiclass.OutputCodeClassifier.predict now uses a more efficient pairwise distance reduction. As a consequence, the tie-breaking strategy is different and thus the predicted labels may be different. 25196 by Guillaume Lemaitre <glemaitre>.
  • The fit_transform method of decomposition.DictionaryLearning is more efficient but may produce different results from previous versions when transform_algorithm is not the same as fit_algorithm and the number of iterations is small. 24871 by Omar Salman <OmarManzoor>.

Changes impacting all modules

  • The get_feature_names_out method of the following classes now raises a NotFittedError if the instance is not fitted. This ensures the error is consistent in all estimators with the get_feature_names_out method.

    • feature_extraction.text.TfidfTransformer
    • kernel_approximation.AdditiveChi2Sampler
    • impute.IterativeImputer
    • impute.KNNImputer
    • impute.SimpleImputer
    • isotonic.IsotonicRegression
    • preprocessing.Binarizer
    • preprocessing.MaxAbsScaler
    • preprocessing.MinMaxScaler
    • preprocessing.Normalizer
    • preprocessing.OrdinalEncoder
    • preprocessing.PowerTransformer
    • preprocessing.QuantileTransformer
    • preprocessing.RobustScaler
    • preprocessing.StandardScaler

    The NotFittedError displays an informative message asking to fit the instance with the appropriate arguments.

    25294 by John Pangas <jpangas> and 25291, 25367 by Rahil Parikh <rprkh>.
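
    As a minimal sketch of the behavior described above, using preprocessing.StandardScaler as one representative of the listed classes (the toy data is made up for illustration):

    ```python
    from sklearn.exceptions import NotFittedError
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()

    # Calling get_feature_names_out before fit is expected to raise NotFittedError.
    try:
        scaler.get_feature_names_out()
        raised = False
    except NotFittedError as exc:
        raised = True
        message = str(exc)  # informative message asking to fit the instance

    # After fitting, the same call returns the generated feature names.
    scaler.fit([[0.0, 1.0], [2.0, 3.0]])
    names = scaler.get_feature_names_out()
    ```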

Changelog

sklearn.base

  • A __sklearn_clone__ protocol is now available to override the default behavior of base.clone. 24568 by Thomas Fan.
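
    A minimal sketch of the protocol, assuming a hypothetical SharedEstimator class (not part of scikit-learn) that opts out of copying by returning itself:

    ```python
    from sklearn.base import BaseEstimator, clone


    class SharedEstimator(BaseEstimator):
        """Hypothetical estimator that should never be copied by clone."""

        def __init__(self, alpha=1.0):
            self.alpha = alpha

        def __sklearn_clone__(self):
            # Override the default behavior: hand back the same instance
            # instead of building a new, unfitted copy from the parameters.
            return self


    shared = SharedEstimator(alpha=0.5)
    same = clone(shared)
    ```

    With the override in place, clone(shared) returns the very same object. Without it, base.clone builds a fresh estimator from get_params().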

sklearn.cluster

  • The sample_weight parameter of cluster.KMeans.predict and cluster.MiniBatchKMeans.predict is now deprecated and will be removed in v1.5. 25251 by Gleb Levitski <glevv>.

sklearn.decomposition

  • decomposition.DictionaryLearning now accepts the parameter callback for consistency with the function decomposition.dict_learning. 24871 by Omar Salman <OmarManzoor>.

sklearn.ensemble

  • Compute a custom out-of-bag score by passing a callable to ensemble.RandomForestClassifier, ensemble.RandomForestRegressor, ensemble.ExtraTreesClassifier and ensemble.ExtraTreesRegressor. 25177 by Tim Head <betatim>.
  • ensemble.GradientBoostingClassifier now exposes out-of-bag scores via the oob_scores_ and oob_score_ attributes. 24882 by Ashwin Mathur <awinml>.
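
    The two entries above can be sketched as follows. The dataset, the choice of balanced_accuracy_score as the callable, and the hyperparameters are assumptions made for illustration. For gradient boosting, the out-of-bag attributes are only populated when subsample is below 1.0, and they track the loss on out-of-bag samples rather than an accuracy.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.metrics import balanced_accuracy_score

    X, y = make_classification(n_samples=200, n_features=8, random_state=0)

    # Custom out-of-bag score: pass a callable taking (y_true, y_pred).
    forest = RandomForestClassifier(
        n_estimators=50,
        bootstrap=True,
        oob_score=balanced_accuracy_score,
        random_state=0,
    ).fit(X, y)
    custom_oob = forest.oob_score_

    # Out-of-bag history for gradient boosting (requires subsample < 1.0).
    booster = GradientBoostingClassifier(
        n_estimators=20, subsample=0.5, random_state=0
    ).fit(X, y)
    per_stage = booster.oob_scores_  # one value per boosting stage
    final = booster.oob_score_       # value at the last stage
    ```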

sklearn.exceptions

  • Added exceptions.InconsistentVersionWarning, which is raised when a scikit-learn estimator is unpickled with a scikit-learn version that is inconsistent with the scikit-learn version the estimator was pickled with. 25297 by Thomas Fan.
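
    One possible way to act on this warning is to escalate it to an error while loading a pickle, so that a version mismatch fails loudly. The sketch below pickles and unpickles within the same installation, so no warning is expected. The model and toy data are assumptions for illustration.

    ```python
    import pickle
    import warnings

    from sklearn.exceptions import InconsistentVersionWarning
    from sklearn.linear_model import LogisticRegression

    model = LogisticRegression().fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
    payload = pickle.dumps(model)

    with warnings.catch_warnings():
        # Turn a version mismatch into an exception instead of a warning.
        warnings.simplefilter("error", category=InconsistentVersionWarning)
        restored = pickle.loads(payload)
    ```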

sklearn.impute

  • Added the parameter fill_value to impute.IterativeImputer. 25232 by Thijs van Weezel <ValueInvestorThijs>.
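
    A minimal sketch of the new parameter, assuming it is combined with initial_strategy="constant" so that missing entries start from the given constant before the iterative rounds. The toy matrix is made up for illustration.

    ```python
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    X = np.array(
        [
            [1.0, 2.0],
            [np.nan, 4.0],
            [5.0, np.nan],
            [7.0, 8.0],
        ]
    )

    # fill_value is the constant used for the initial imputation.
    imputer = IterativeImputer(
        initial_strategy="constant", fill_value=0.0, max_iter=5, random_state=0
    )
    X_filled = imputer.fit_transform(X)
    ```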

sklearn.naive_bayes

  • naive_bayes.GaussianNB no longer raises a ZeroDivisionError when the provided sample_weight reduces the problem to a single class in fit. 24140 by Jonathan Ohayon <Johayon> and Chiara Marmo <cmarmo>.

sklearn.pipeline

  • pipeline.FeatureUnion can now use indexing notation (e.g. feature_union["scaler"]) to access transformers by name. 25093 by Thomas Fan.
  • pipeline.FeatureUnion can now access the feature_names_in_ attribute if the X value seen during .fit has a columns attribute and all columns are strings, e.g. when X is a pandas.DataFrame. 25220 by Ian Thompson <it176131>.
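
    Both entries above can be sketched together. The step names "scaler" and "pca", the column names, and the data are assumptions chosen for illustration, and pandas must be installed.

    ```python
    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.pipeline import FeatureUnion
    from sklearn.preprocessing import StandardScaler

    X = pd.DataFrame(
        {"height": [1.0, 2.0, 3.0, 4.0], "width": [4.0, 3.0, 1.0, 2.0]}
    )

    union = FeatureUnion(
        [("scaler", StandardScaler()), ("pca", PCA(n_components=1))]
    )
    union.fit(X)

    # Indexing notation returns the transformer registered under that name.
    scaler = union["scaler"]

    # Available because X has string column names.
    input_names = list(union.feature_names_in_)
    ```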

sklearn.preprocessing

  • Added a feature_name_combiner parameter to preprocessing.OneHotEncoder. This specifies a custom callable to create the feature names returned by get_feature_names_out. The callable combines the input arguments (input_feature, category) into a string. 22506 by Mario Kostelac <mariokostelac>.
  • Added support for sample_weight in preprocessing.KBinsDiscretizer. This allows specifying a weight for each sample to be used while fitting. The option is only available when strategy is set to quantile or kmeans. 24935 by Seladus <seladus>, Guillaume Lemaitre <glemaitre>, and Dea María Léon <deamarialeon>, 25257 by Gleb Levitski <glevv>.
  • kernel_approximation.AdditiveChi2Sampler is now stateless. The sample_interval_ attribute is deprecated and will be removed in 1.5. 25190 by Vincent Maladière <Vincent-Maladiere>.
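
    A rough sketch of the two new preprocessing options. The combiner function equals_combiner, the data, and the weights are hypothetical. subsample=None is passed explicitly as an assumption, to keep weighting and subsampling from interacting.

    ```python
    import numpy as np
    from sklearn.preprocessing import KBinsDiscretizer, OneHotEncoder


    def equals_combiner(input_feature, category):
        # Hypothetical combiner: build names such as "x0=red".
        return f"{input_feature}={category}"


    colors = np.array([["red"], ["blue"], ["red"]], dtype=object)
    encoder = OneHotEncoder(feature_name_combiner=equals_combiner).fit(colors)
    names = list(encoder.get_feature_names_out())

    # Per-sample weights while fitting the bin edges (kmeans strategy).
    values = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
    weights = np.array([1.0, 1.0, 1.0, 3.0, 3.0, 3.0])
    binner = KBinsDiscretizer(
        n_bins=2, encode="ordinal", strategy="kmeans", subsample=None
    )
    binner.fit(values, sample_weight=weights)
    bins = binner.transform(values).ravel()
    ```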

sklearn.utils

  • estimator_checks.check_transformers_unfitted_stateless has been introduced to ensure stateless transformers don't raise NotFittedError during transform with no prior call to fit or fit_transform. 25190 by Vincent Maladière <Vincent-Maladiere>.
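
    The property that this check targets can be illustrated with kernel_approximation.AdditiveChi2Sampler, which is stateless as of this release: transform is called with no prior fit. The input values are made up and must be non-negative for this transformer.

    ```python
    import numpy as np
    from sklearn.kernel_approximation import AdditiveChi2Sampler

    X = np.array([[0.0, 1.0], [2.0, 3.0]])

    # No call to fit or fit_transform: a stateless transformer should not
    # raise NotFittedError here.
    sampler = AdditiveChi2Sampler(sample_steps=2)
    X_mapped = sampler.transform(X)
    ```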

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.2, including:

TODO: update at the time of the release.