Revisions of python-scikit-learn

Ana Guerrero (anag+factory) accepted request 1180116 from Daniel Garcia (dgarcia) (revision 31)
- Update to 1.5.0 (bsc#1226185, CVE-2024-5206):
  ## Security
  * Fix: feature_extraction.text.CountVectorizer and
    feature_extraction.text.TfidfVectorizer no longer store discarded
    tokens from the training set in their stop_words_ attribute. This
    attribute used to hold tokens that were too frequent (above max_df)
    as well as tokens that were too rare (below min_df). This fixes a
    potential security issue (data leak) if the discarded rare tokens
    hold sensitive information from the training set without the model
    developer’s knowledge.
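
The leak described above can be illustrated with a minimal, standard-library sketch of document-frequency pruning. This is not scikit-learn's implementation: the helper `prune_vocabulary`, its thresholds, and the sample documents are all hypothetical. It only shows why a stored set of discarded tokens can contain sensitive one-off strings from the training data.

```python
from collections import Counter


def prune_vocabulary(docs, min_df=2, max_df=0.8):
    """Split tokens into kept and discarded sets by document frequency.

    Hypothetical helper: min_df is an absolute document count and max_df
    a fraction of documents. Not scikit-learn's code.
    """
    n_docs = len(docs)
    # Count in how many documents each token appears (not total occurrences).
    df = Counter(token for doc in docs for token in set(doc.split()))
    kept = {t for t, c in df.items() if c >= min_df and c / n_docs <= max_df}
    discarded = set(df) - kept
    return kept, discarded


docs = [
    "the patient reported pain",
    "the patient recovered",
    "the password hunter2 leaked",
    "the patient reported progress",
]
kept, discarded = prune_vocabulary(docs)
# Rare tokens such as "hunter2" end up in `discarded`; persisting that set
# alongside a model would ship them to anyone who can read the model.
```

Keeping only `kept` on the fitted object, as the fix does for stop_words_, avoids carrying the rare tokens around.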
  ## Changed models
  * Efficiency: The subsampling in preprocessing.QuantileTransformer is
    now more efficient for dense arrays but the fitted quantiles and
    the results of transform may be slightly different than before
    (keeping the same statistical properties). #27344 by Xuefeng Xu.
  * Enhancement: decomposition.PCA, decomposition.SparsePCA and
    decomposition.TruncatedSVD now set the sign of the components_
    attribute based on the component values instead of using the
    transformed data as reference. This change is needed to be able to
    offer consistent component signs across all PCA solvers, including
    the new svd_solver="covariance_eigh" option introduced in this
    release.
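
A sign convention based only on the component values can be sketched as follows. The rule used here (make the largest-magnitude entry of each component positive) is an assumption chosen for illustration, and `flip_signs` is a hypothetical helper, not the library's function. The point is that two solvers returning the same components up to sign end up with identical output.

```python
def flip_signs(components):
    """Fix the sign of each component (row) from its own values.

    Assumed convention: the entry of largest absolute value in each row
    is made positive. Hypothetical helper, not scikit-learn's code.
    """
    fixed = []
    for row in components:
        pivot = max(row, key=abs)
        sign = -1.0 if pivot < 0 else 1.0
        fixed.append([sign * value for value in row])
    return fixed


# The same two components as two different solvers might return them,
# differing only by a global sign per row.
solver_a = [[0.6, -0.8], [-0.8, -0.6]]
solver_b = [[-0.6, 0.8], [0.8, 0.6]]

normalized_a = flip_signs(solver_a)
normalized_b = flip_signs(solver_b)
```

Because the rule never looks at the transformed data, it gives the same answer whichever solver produced the components.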
  ## Changes impacting many modules
  * Fix: Raise ValueError with an informative error message when
    passing 1D sparse arrays to methods that expect 2D sparse inputs.
    #28988 by Olivier Grisel.
  * API Change: The name of the input of the inverse_transform method
    of estimators has been standardized to X. As a consequence, Xt is
    deprecated and will be removed in version 1.7 in the following
    estimators: cluster.FeatureAgglomeration,
    decomposition.MiniBatchNMF, decomposition.NMF,
Ana Guerrero (anag+factory) accepted request 1169326 from Dirk Mueller (dirkmueller) (revision 29)
- update to 1.4.2:
  * This release only includes support for numpy 2.
Ana Guerrero (anag+factory) accepted request 1124107 from Dirk Mueller (dirkmueller) (revision 27)
- update to 1.3.2:
  * All dataset fetchers now accept `data_home` as any object that
    implements the :class:`os.PathLike` interface, for instance,
    :class:`pathlib.Path`.
  * Fixes a bug in :class:`decomposition.KernelPCA` by forcing the
    output of the internal :class:`preprocessing.KernelCenterer` to
    be a default array. When the arpack solver is used, it expects
    an array with a `dtype` attribute.
  * Fixes a bug for metrics using `zero_division=np.nan`
    (e.g. :func:`~metrics.precision_score`) within a parallel loop
    (e.g. :func:`~model_selection.cross_val_score`) where the
    singleton for `np.nan` is different in the sub-processes.
  * Do not leak data via non-initialized memory in decision tree
    pickle files and make the generation of those files
    deterministic.
  * Ridge models with `solver='sparse_cg'` may have slightly
    different results with scipy>=1.12, because of an underlying
    change in the scipy solver.
  * The `set_output` API correctly works with list input.
  * :class:`calibration.CalibratedClassifierCV` can now handle
    models that produce large prediction scores.
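
The `zero_division=np.nan` entry above comes down to a general Python pitfall: an identity check against a NaN object stops working once the value has been copied into another process. The sketch below uses `float("nan")` as a stand-in for `np.nan` and a pickle round trip as a stand-in for the transfer to a worker; it does not reproduce the actual scikit-learn patch.

```python
import math
import pickle

nan = float("nan")  # stand-in for np.nan in this sketch

# Round-tripping through pickle mimics sending the value to a worker process.
nan_in_worker = pickle.loads(pickle.dumps(nan))

identity_check = nan_in_worker is nan      # identity is lost after the copy
equality_check = nan_in_worker == nan      # NaN never compares equal to anything
robust_check = math.isnan(nan_in_worker)   # value-based test, works in any process
```

Only the `math.isnan` style of check survives the copy, which is why code that has to run inside parallel loops is safer testing the value than the object.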

- Skip another recalcitrant test on 32 bit.
  * We are in the process of introducing a new way to route metadata
    such as sample_weight throughout the codebase, which would
    affect how meta-estimators such as pipeline.Pipeline and
  * Originally hosted in the scikit-learn-contrib repository,
  * A new category encoding strategy preprocessing.TargetEncoder
    encodes the categories based on a shrunk estimate of the average
  * The classes tree.DecisionTreeClassifier and tree.DecisionTreeRegressor
Dominique Leuenberger (dimstar_suse) accepted request 1103058 from Steve Kowalik (StevenK) (revision 26)
- Skip another recalcitrant test on 32 bit.
Dominique Leuenberger (dimstar_suse) accepted request 950579 from Steve Kowalik (StevenK) (revision 16)
- Update to 1.0.2: 
  * Fixed an infinite loop in cluster.SpectralClustering by moving an iteration counter from try to except. #21271 by Tyler Martin.
  * datasets.fetch_openml is now thread safe. Data is first downloaded to a temporary subfolder and then renamed. #21833 by Siavash Rezazadeh.
  * Fixed the constraint on the objective function of decomposition.DictionaryLearning, decomposition.MiniBatchDictionaryLearning, decomposition.SparsePCA and decomposition.MiniBatchSparsePCA to be convex and match the referenced article. #19210 by Jérémie du Boisberranger.
  * ensemble.RandomForestClassifier, ensemble.RandomForestRegressor, ensemble.ExtraTreesClassifier, ensemble.ExtraTreesRegressor, and ensemble.RandomTreesEmbedding now raise a ValueError when bootstrap=False and max_samples is not None. #21295 by Haoyin Xu.
  * Solve a bug in ensemble.GradientBoostingClassifier where the exponential loss was computing the positive gradient instead of the negative one. #22050 by Guillaume Lemaitre.
  * Fixed feature_selection.SelectFromModel by improving support for base estimators that do not set feature_names_in_. #21991 by Thomas Fan.
  * Fix a bug in linear_model.RidgeClassifierCV where the method predict was performing an argmax on the scores obtained from decision_function instead of returning the multilabel indicator matrix. #19869 by Guillaume Lemaitre.
  * linear_model.LassoLarsIC now correctly computes AIC and BIC. An error is now raised when n_features > n_samples and when the noise variance is not provided. #21481 by Guillaume Lemaitre and Andrés Babino.
  * Fixed an unnecessary error when fitting manifold.Isomap with a precomputed dense distance matrix where the neighbors graph has multiple disconnected components. #21915 by Tom Dupre la Tour.
  * All sklearn.metrics.DistanceMetric subclasses now correctly support read-only buffer attributes. This fixes a regression introduced in 1.0.0 with respect to 0.24.2. #21694 by Julien Jerphanion.
  * neighbors.KDTree and neighbors.BallTree now correctly support read-only buffer attributes. #21845 by Thomas Fan.
  * Fixes compatibility bug with NumPy 1.22 in preprocessing.OneHotEncoder. #21517 by Thomas Fan.
  * Prevents tree.plot_tree from drawing out of the boundary of the figure. #21917 by Thomas Fan.
  * Support loading pickles of decision tree models when the pickle has been generated on a platform with a different bitness. A typical example is to train and pickle the model on a 64 bit machine and load the model on a 32 bit machine for prediction. #21552 by Loïc Estève.
  * Non-fit methods in the following classes do not raise a UserWarning when fitted on DataFrames with valid feature names: covariance.EllipticEnvelope, ensemble.IsolationForest, ensemble.AdaBoostClassifier, neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.RadiusNeighborsClassifier, neighbors.RadiusNeighborsRegressor. #21199 by Thomas Fan.
  * Fixed calibration.CalibratedClassifierCV to take into account sample_weight when computing the base estimator prediction when ensemble=False. #20638 by Julien Bohné.
  * Fixed a bug in calibration.CalibratedClassifierCV with method="sigmoid" that was ignoring the sample_weight when computing the Bayesian priors. #21179 by Guillaume Lemaitre.
  * Compute y_std properly with multi-target in sklearn.gaussian_process.GaussianProcessRegressor allowing proper normalization in multi-target settings. #20761 by Patrick de C. T. R. Ferreira.
  * Fixed a bug in feature_extraction.text.CountVectorizer and feature_extraction.text.TfidfVectorizer by raising an error when ‘min_df’ or ‘max_df’ are floating-point numbers greater than 1. #20752 by Alek Lefebvre.
  * linear_model.LogisticRegression now raises a better error message when the solver does not support sparse matrices with int64 indices. #21093 by Tom Dupre la Tour.
  * neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.RadiusNeighborsClassifier, neighbors.RadiusNeighborsRegressor with metric="precomputed" now raise an error for bsr and dok sparse matrices in methods: fit, kneighbors and radius_neighbors, due to handling of explicit zeros in bsr and dok sparse graph formats. #21199 by Thomas Fan.
  * pipeline.Pipeline.get_feature_names_out correctly passes feature names out from one step of a pipeline to the next. #21351 by Thomas Fan.
  * svm.SVC and svm.SVR check for an inconsistency in their internal representation and raise an error instead of segfaulting. This fix also resolves CVE-2020-28975. #21336 by Thomas Fan.
  * manifold.TSNE now avoids numerical underflow issues during affinity matrix computation.
  * manifold.Isomap now connects disconnected components of the neighbors graph along some minimum distance pairs, instead of changing every infinite distance to zero.
  * Many others, see full changelog at https://scikit-learn.org/dev/whats_new/v1.0.html
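
The download-then-rename approach mentioned for datasets.fetch_openml above is a common way to keep concurrent readers from seeing half-written cache entries. The sketch below shows the general pattern with the standard library only. It swaps in a temporary file rather than a temporary subfolder, and `atomic_write` and the file names are hypothetical, not part of scikit-learn.

```python
import os
import tempfile


def atomic_write(target_path, data):
    """Write bytes to a temporary file next to the target, then rename it.

    Hypothetical helper illustrating the pattern; not scikit-learn's code.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # os.replace swaps the file into place in a single step, so other
        # threads see either no file or the complete file.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, "dataset.bin")
    atomic_write(path, b"payload")
    with open(path, "rb") as handle:
        content = handle.read()
    leftovers = os.listdir(cache_dir)
```

After a successful call, only the final file remains in the cache directory; a failed write leaves the previous cache state untouched.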
Dominique Leuenberger (dimstar_suse) accepted request 897859 from Dirk Mueller (dirkmueller) (revision 15)
- update to 0.24.2:
  * Many bug fixes; see https://scikit-learn.org/stable/whats_new/v0.24.html
- drop scikit-learn-pr19101-npfloat.patch: upstream