Overview

Request 1180109 superseded

- Update to 1.5.0 (bsc#1226185, CVE-2024-5206):
## Security
* Fix: feature_extraction.text.CountVectorizer and
feature_extraction.text.TfidfVectorizer no longer store discarded
tokens from the training set in their stop_words_ attribute. This
attribute previously held tokens that were too frequent (above
max_df) as well as tokens that were too rare (below min_df). This
fixes a potential security issue (data leak) if the discarded rare
tokens hold sensitive information from the training set without
the model developer’s knowledge.
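The leak described above can be checked on an installed version with a small sketch like the following. The documents and the token "password123" are made up for illustration. Because the attribute may be absent on fixed versions, the code reads it with getattr instead of assuming it exists.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical training documents; "password123" stands in for a rare,
# sensitive token that appears in only one document.
docs = [
    "the cat sat",
    "the cat ran",
    "the dog sat password123",
]

# min_df=2 discards every token that occurs in fewer than two documents.
vec = CountVectorizer(min_df=2)
vec.fit(docs)

# Tokens that survived the document-frequency filter.
vocab = set(vec.vocabulary_)

# On affected versions the discarded tokens were kept on the fitted
# estimator; on fixed versions this attribute may be missing or empty.
discarded = getattr(vec, "stop_words_", None)
leaked = discarded is not None and "password123" in discarded

print("vocabulary:", sorted(vocab))
print("rare token retained on estimator:", leaked)
```

If the last line prints True, pickling or sharing the fitted vectorizer would also ship the rare token.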
## Changed models
* Efficiency: The subsampling in preprocessing.QuantileTransformer is
now more efficient for dense arrays, but the fitted quantiles and
the results of transform may differ slightly from before (the
statistical properties are preserved). #27344 by Xuefeng Xu.
* Enhancement: decomposition.PCA, decomposition.SparsePCA and
decomposition.TruncatedSVD now set the sign of the components_
attribute based on the component values instead of using the
transformed data as reference. This change is needed to offer
consistent component signs across all PCA solvers, including the
new svd_solver="covariance_eigh" option introduced in this
release.
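To illustrate what choosing signs "based on the component values" can mean, here is a minimal NumPy sketch. The helper normalize_component_signs is hypothetical and the rule it applies (make the largest-magnitude entry of each component positive) is one possible convention assumed for illustration, not necessarily the exact rule scikit-learn uses.

```python
import numpy as np


def normalize_component_signs(components):
    """Flip each row so its largest-magnitude entry is positive.

    Hypothetical helper: the sign depends only on the component values,
    not on the data projected onto them.
    """
    components = np.asarray(components, dtype=float)
    rows = np.arange(components.shape[0])
    max_abs_cols = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[rows, max_abs_cols])
    signs[signs == 0] = 1.0  # leave all-zero rows unchanged
    return components * signs[:, np.newaxis]


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Xc = X - X.mean(axis=0)

# Singular vectors are only defined up to sign, so different solvers
# may return Vt or -Vt for the same data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

a = normalize_component_signs(Vt)
b = normalize_component_signs(-Vt)
print(np.allclose(a, b))
```

Because the rule looks only at the components themselves, two solvers that disagree on the raw sign end up with the same normalized result.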
## Changes impacting many modules
* Fix: A ValueError with an informative error message is now raised
when passing 1D sparse arrays to methods that expect 2D sparse
inputs. #28988 by Olivier Grisel.
* API Change: The name of the input of the inverse_transform method
of estimators has been standardized to X. As a consequence, Xt is
deprecated and will be removed in version 1.7 in the following
estimators: cluster.FeatureAgglomeration,
decomposition.MiniBatchNMF, decomposition.NMF,

Request History
dgarcia created request


factory-auto added opensuse-review-team as a reviewer

Please review sources


factory-auto accepted review

Check script succeeded


anag+factory set openSUSE:Factory:Staging:E as a staging project

Being evaluated by staging project "openSUSE:Factory:Staging:E"


anag+factory accepted review

Picked "openSUSE:Factory:Staging:E"


dgarcia superseded request

superseded by 1180116
