Revisions of python-scikit-learn
Ana Guerrero (anag+factory) accepted request 1201245 from Factory Maintainer (factory-maintainer) (revision 34)
Automatic submission by obs-autosubmit
Dominique Leuenberger (dimstar_suse) accepted request 1198177 from Dirk Mueller (dirkmueller) (revision 33)
- prepare for python 3.13 testing
Dominique Leuenberger (dimstar_suse) accepted request 1190078 from Steve Kowalik (StevenK) (revision 32)
- Add patch support-pytest-8.3.patch:
  * Fix property wrapping, uncovered by Pytest 8.3 changes.
Ana Guerrero (anag+factory) accepted request 1180116 from Daniel Garcia (dgarcia) (revision 31)
- Update to 1.5.0 (bsc#1226185, CVE-2024-5206):
  ## Security
  * Fix: feature_extraction.text.CountVectorizer and feature_extraction.text.TfidfVectorizer no longer store discarded tokens from the training set in their stop_words_ attribute. This attribute would hold too frequent (above max_df) but also too rare tokens (below min_df). This fixes a potential security issue (data leak) if the discarded rare tokens hold sensitive information from the training set without the model developer's knowledge.
  ## Changed models
  * Efficiency: The subsampling in preprocessing.QuantileTransformer is now more efficient for dense arrays, but the fitted quantiles and the results of transform may be slightly different than before (keeping the same statistical properties). #27344 by Xuefeng Xu.
  * Enhancement: decomposition.PCA, decomposition.SparsePCA and decomposition.TruncatedSVD now set the sign of the components_ attribute based on the component values instead of using the transformed data as reference. This change is needed to be able to offer consistent component signs across all PCA solvers, including the new svd_solver="covariance_eigh" option introduced in this release.
  ## Changes impacting many modules
  * Fix: Raise ValueError with an informative error message when passing 1D sparse arrays to methods that expect 2D sparse inputs. #28988 by Olivier Grisel.
  * API Change: The name of the input of the inverse_transform method of estimators has been standardized to X. As a consequence, Xt is deprecated and will be removed in version 1.7 in the following estimators: cluster.FeatureAgglomeration, decomposition.MiniBatchNMF, decomposition.NMF,
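The CVE-2024-5206 entry above concerns the document-frequency filtering of the text vectorizers: tokens outside the `min_df`/`max_df` bounds are dropped from the vocabulary, and as of 1.5.0 the discarded tokens are no longer retained in the fitted `stop_words_` attribute. A minimal sketch of that filtering, using a made-up toy corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical corpus for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the bird flew over the yard",
]

# "the" appears in every document (df = 1.0 > max_df), so it is
# discarded from the learned vocabulary. Before 1.5.0 such discarded
# tokens were kept in stop_words_; the fix stops storing them there.
vec = CountVectorizer(max_df=0.9)
vec.fit(docs)

print("the" in vec.vocabulary_)  # False: too-frequent token filtered out
print("cat" in vec.vocabulary_)  # True: within the df bounds
```

This only demonstrates the filtering behavior itself; the security-relevant change is purely about what the fitted estimator retains afterwards.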
Dominique Leuenberger (dimstar_suse) accepted request 1172097 from Dirk Mueller (dirkmueller) (revision 30)
Ana Guerrero (anag+factory) accepted request 1169326 from Dirk Mueller (dirkmueller) (revision 29)
- update to 1.4.2:
  * This release only includes support for numpy 2.
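NumPy 2 removed some legacy aliases (for example `np.float_`), so code exercising scikit-learn under either NumPy major version should stick to the stable spellings. A small sketch, assuming nothing beyond numpy and scikit-learn being importable:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Works on both NumPy 1.x and 2.x; branch on the major version only if
# you genuinely need version-specific behavior.
major = int(np.__version__.split(".")[0])

# Use np.float64 rather than the removed np.float_ alias.
X = np.asarray([[1.0], [2.0], [3.0]], dtype=np.float64)
scaled = StandardScaler().fit_transform(X)
print(scaled.mean())  # approximately 0.0 after standardization
```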
Ana Guerrero (anag+factory) accepted request 1149083 from Markéta Machová (mcalabkova) (revision 28)
Ana Guerrero (anag+factory) accepted request 1124107 from Dirk Mueller (dirkmueller) (revision 27)
- update to 1.3.2:
  * All dataset fetchers now accept `data_home` as any object that implements the :class:`os.PathLike` interface, for instance, :class:`pathlib.Path`.
  * Fixes a bug in :class:`decomposition.KernelPCA` by forcing the output of the internal :class:`preprocessing.KernelCenterer` to be a default array. When the arpack solver is used, it expects an array with a `dtype` attribute.
  * Fixes a bug for metrics using `zero_division=np.nan` (e.g. :func:`~metrics.precision_score`) within a parallel loop (e.g. :func:`~model_selection.cross_val_score`) where the singleton for `np.nan` will be different in the sub-processes.
  * Do not leak data via non-initialized memory in decision tree pickle files and make the generation of those files deterministic.
  * Ridge models with `solver='sparse_cg'` may have slightly different results with scipy>=1.12, because of an underlying change in the scipy solver.
  * The `set_output` API correctly works with list input.
  * :class:`calibration.CalibratedClassifierCV` can now handle models that produce large prediction scores.
- Skip another recalcitrant test on 32 bit.
  * We are in the process of introducing a new way to route metadata such as sample_weight throughout the codebase, which would affect how meta-estimators such as pipeline.Pipeline and
  * Originally hosted in the scikit-learn-contrib repository,
  * A new category encoding strategy preprocessing.TargetEncoder encodes the categories based on a shrunk estimate of the average
  * The classes tree.DecisionTreeClassifier and tree.DecisionTreeRegressor
Dominique Leuenberger (dimstar_suse) accepted request 1103058 from Steve Kowalik (StevenK) (revision 26)
- Skip another recalcitrant test on 32 bit.
Dominique Leuenberger (dimstar_suse) accepted request 1101760 from Markéta Machová (mcalabkova) (revision 25)
Dominique Leuenberger (dimstar_suse) accepted request 1092217 from Dirk Mueller (dirkmueller) (revision 23)
Dominique Leuenberger (dimstar_suse) accepted request 1030924 from Markéta Machová (mcalabkova) (revision 19)
Dominique Leuenberger (dimstar_suse) accepted request 1002744 from Dirk Mueller (dirkmueller) (revision 18)
Dominique Leuenberger (dimstar_suse) accepted request 980051 from Dirk Mueller (dirkmueller) (revision 17)
Dominique Leuenberger (dimstar_suse) accepted request 950579 from Steve Kowalik (StevenK) (revision 16)
- Update to 1.0.2:
  * Fixed an infinite loop in cluster.SpectralClustering by moving an iteration counter from try to except. #21271 by Tyler Martin.
  * datasets.fetch_openml is now thread-safe. Data is first downloaded to a temporary subfolder and then renamed. #21833 by Siavash Rezazadeh.
  * Fixed the constraint on the objective function of decomposition.DictionaryLearning, decomposition.MiniBatchDictionaryLearning, decomposition.SparsePCA and decomposition.MiniBatchSparsePCA to be convex and match the referenced article. #19210 by Jérémie du Boisberranger.
  * ensemble.RandomForestClassifier, ensemble.RandomForestRegressor, ensemble.ExtraTreesClassifier, ensemble.ExtraTreesRegressor, and ensemble.RandomTreesEmbedding now raise a ValueError when bootstrap=False and max_samples is not None. #21295 by Haoyin Xu.
  * Solve a bug in ensemble.GradientBoostingClassifier where the exponential loss was computing the positive gradient instead of the negative one. #22050 by Guillaume Lemaitre.
  * Fixed feature_selection.SelectFromModel by improving support for base estimators that do not set feature_names_in_. #21991 by Thomas Fan.
  * Fix a bug in linear_model.RidgeClassifierCV where the method predict was performing an argmax on the scores obtained from decision_function instead of returning the multilabel indicator matrix. #19869 by Guillaume Lemaitre.
  * linear_model.LassoLarsIC now correctly computes AIC and BIC. An error is now raised when n_features > n_samples and when the noise variance is not provided. #21481 by Guillaume Lemaitre and Andrés Babino.
  * Fixed an unnecessary error when fitting manifold.Isomap with a precomputed dense distance matrix where the neighbors graph has multiple disconnected components. #21915 by Tom Dupre la Tour.
  * All sklearn.metrics.DistanceMetric subclasses now correctly support read-only buffer attributes. This fixes a regression introduced in 1.0.0 with respect to 0.24.2. #21694 by Julien Jerphanion.
  * neighbors.KDTree and neighbors.BallTree correctly support read-only buffer attributes. #21845 by Thomas Fan.
  * Fixes compatibility bug with NumPy 1.22 in preprocessing.OneHotEncoder. #21517 by Thomas Fan.
  * Prevents tree.plot_tree from drawing out of the boundary of the figure. #21917 by Thomas Fan.
  * Support loading pickles of decision tree models when the pickle has been generated on a platform with a different bitness. A typical example is to train and pickle the model on a 64 bit machine and load the model on a 32 bit machine for prediction. #21552 by Loïc Estève.
  * Non-fit methods in the following classes do not raise a UserWarning when fitted on DataFrames with valid feature names: covariance.EllipticEnvelope, ensemble.IsolationForest, ensemble.AdaBoostClassifier, neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.RadiusNeighborsClassifier, neighbors.RadiusNeighborsRegressor. #21199 by Thomas Fan.
  * Fixed calibration.CalibratedClassifierCV to take into account sample_weight when computing the base estimator prediction when ensemble=False. #20638 by Julien Bohné.
  * Fixed a bug in calibration.CalibratedClassifierCV with method="sigmoid" that was ignoring the sample_weight when computing the Bayesian priors. #21179 by Guillaume Lemaitre.
  * Compute y_std properly with multi-target in sklearn.gaussian_process.GaussianProcessRegressor, allowing proper normalization in the multi-target setting. #20761 by Patrick de C. T. R. Ferreira.
  * Fixed a bug in feature_extraction.CountVectorizer and feature_extraction.TfidfVectorizer by raising an error when `min_df` or `max_df` are floating-point numbers greater than 1. #20752 by Alek Lefebvre.
  * linear_model.LogisticRegression now raises a better error message when the solver does not support sparse matrices with int64 indices. #21093 by Tom Dupre la Tour.
  * neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.RadiusNeighborsClassifier, neighbors.RadiusNeighborsRegressor with metric="precomputed" raise an error for bsr and dok sparse matrices in the methods fit, kneighbors and radius_neighbors, due to handling of explicit zeros in bsr and dok sparse graph formats. #21199 by Thomas Fan.
  * pipeline.Pipeline.get_feature_names_out correctly passes feature names out from one step of a pipeline to the next. #21351 by Thomas Fan.
  * svm.SVC and svm.SVR check for an inconsistency in their internal representation and raise an error instead of segfaulting. This fix also resolves CVE-2020-28975. #21336 by Thomas Fan.
  * manifold.TSNE now avoids numerical underflow issues during affinity matrix computation.
  * manifold.Isomap now connects disconnected components of the neighbors graph along some minimum distance pairs, instead of changing every infinite distance to zero.
  * Many others, see the full changelog at https://scikit-learn.org/dev/whats_new/v1.0.html
Dominique Leuenberger (dimstar_suse) accepted request 897859 from Dirk Mueller (dirkmueller) (revision 15)
- update to 0.24.2:
  * a lot of bugfixes, see https://scikit-learn.org/stable/whats_new/v0.24.html
- drop scikit-learn-pr19101-npfloat.patch: upstream