Revisions of python-nltk
Ana Guerrero (anag+factory)
accepted
request 1218976
from
Daniel Garcia (dgarcia)
(revision 19)
- Use _service to download source and exclude documentation that has a non-commercial license (boo#1232448)
- Remove nltk_data to avoid redistribution of files with a non-commercial license (boo#1232448):
  > NLTK corpora are provided under the terms given in the README file
  > for each corpus; all are redistributable and available for
  > non-commercial use.
- Remove the no longer needed skip-networked-test.patch
Ana Guerrero (anag+factory)
accepted
request 1218494
from
Matej Cepl (mcepl)
(revision 18)
- Update to 3.9.1 (changes since 3.8.1):
  * Fixed bug that prevented wordnet from loading
  * Fix security vulnerability CVE-2024-39705 (breaking change)
  * Replace pickled models (punkt, chunker, taggers) by new pickle-free "_tab" packages
  * No longer sort Wordnet synsets and relations (sort in calling function when required)
  * Only strip the last suffix in Wordnet Morphy, thus restricting synsets() results
  * Add Python 3.12 support
  * Many other minor fixes
- Refresh nltk_data
- Remove upstreamed patches:
  - CVE-2024-39705.patch
  - nltk-pr3207-py312.patch
- Update to 3.8
Dominique Leuenberger (dimstar_suse)
accepted
request 1189727
from
Daniel Garcia (dgarcia)
(revision 17)
- Add CVE-2024-39705.patch upstream patch to fix unsafe pickle usage. (CVE-2024-39705, gh#nltk/nltk#3266, bsc#1227174).
- Drop CVE-2024-39705-disable-download.patch as it's not needed anymore.
Ana Guerrero (anag+factory)
accepted
request 1185062
from
Matej Cepl (mcepl)
(revision 16)
- Use the tarball from GitHub instead of the Zip archive from PyPI; the latter has a very messy combination of CRLF and LF EOLs, which is hard to patch.
- Refresh all patches from the original locations.
- Add CVE-2024-39705-disable-download.patch as a crude workaround for CVE-2024-39705 (gh#nltk/nltk#3266, bsc#1227174).
Dominique Leuenberger (dimstar_suse)
accepted
request 1077159
from
Factory Maintainer (factory-maintainer)
(revision 14)
Automatic submission by obs-autosubmit
Dominique Leuenberger (dimstar_suse)
accepted
request 1056667
from
Markéta Machová (mcalabkova)
(revision 13)
Dominique Leuenberger (dimstar_suse)
accepted
request 1045543
from
Matej Cepl (mcepl)
(revision 12)
- Complete nltk_data.tar.xz for offline testing
- Fix failing tests (gh#nltk/nltk#2969) by adding patches:
  - port-2to3.patch
  - skip-networked-test.patch
- Clean up the SPEC to get rid of rpmlint warnings.
Dominique Leuenberger (dimstar_suse)
accepted
request 965220
from
Dirk Mueller (dirkmueller)
(revision 11)
- Update to 3.7
  - Improve and update the NLTK team page on nltk.org (#2855, #2941)
  - Drop support for Python 3.6, support Python 3.10 (#2920)
- Update to 3.6.7
  - Resolve IndexError in `sent_tokenize` and `word_tokenize` (#2922)
- Update to 3.6.6
  - Refactor `gensim.doctest` to work for gensim 4.0.0 and up (#2914)
  - Add Precision, Recall, F-measure, Confusion Matrix to Taggers (#2862)
  - Added warnings if .zip files exist without any corresponding .csv files. (#2908)
  - Fix `FileNotFoundError` when the `download_dir` is a non-existing nested folder (#2910)
  - Rename omw to omw-1.4 (#2907)
  - Resolve ReDoS opportunity by fixing incorrectly specified regex (#2906, bsc#1191030, CVE-2021-3828).
  - Support OMW 1.4 (#2899)
  - Deprecate Tree get and set node methods (#2900)
  - Fix broken inaugural test case (#2903)
  - Use Multilingual Wordnet Data from OMW with newer Wordnet versions (#2889)
  - Keep NLTK's "tokenize" module working with pathlib (#2896)
  - Make prettyprinter more readable (#2893)
  - Update links to the nltk book (#2895)
  - Add `CITATION.cff` to nltk (#2880)
  - Resolve serious ReDoS in PunktSentenceTokenizer (#2869)
  - Delete old CI config files (#2881)
Dominique Leuenberger (dimstar_suse)
accepted
request 812413
from
Tomáš Chvátal (scarabeus_iv)
(revision 10)
- Update to v3.5
  * add support for Python 3.8
  * drop support for Python 2
  * create NLTK's own Tokenizer class distinct from the Treebank reference tokeniser
  * update Vader sentiment analyser
  * fix JSON serialization of some PoS taggers
  * minor improvements in grammar.CFG, Vader, pl196x corpus reader, StringTokenizer
  * change implementation of <= and >= for FreqDist so they are partial orders
  * make FreqDist iterable
  * correctly handle Penn Treebank trees with an unlabeled branching top node
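The FreqDist entry above says that <= and >= became partial orders. A minimal sketch of what a partial order over frequency counts can look like, using a plain `collections.Counter` and a hypothetical helper `is_subdist` (an assumption for illustration, not NLTK's actual implementation):

```python
from collections import Counter


def is_subdist(a, b):
    """Return True if every count in `a` is <= the matching count in `b`.

    Hypothetical helper: one count mapping is "less than or equal to"
    another only when it is dominated key by key.
    """
    return all(count <= b.get(key, 0) for key, count in a.items())


small = Counter("ab")        # a: 1, b: 1
large = Counter("aabbc")     # a: 2, b: 2, c: 1
left = Counter({"x": 2, "y": 1})
right = Counter({"x": 1, "y": 3})

print(is_subdist(small, large))   # small is dominated by large
print(is_subdist(left, right))    # left has more "x" than right
print(is_subdist(right, left))    # right has more "y" than left
```

Under such an ordering some pairs are incomparable: neither `left` nor `right` is <= the other, which is what distinguishes a partial order from a total one.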
Dominique Leuenberger (dimstar_suse)
accepted
request 787913
from
Dirk Mueller (dirkmueller)
(revision 9)
- Update to 3.4.5 (bsc#1146427, CVE-2019-14751):
Dominique Leuenberger (dimstar_suse)
accepted
request 784877
from
Tomáš Chvátal (scarabeus_iv)
(revision 8)
- Fix build without python2
Dominique Leuenberger (dimstar_suse)
accepted
request 738364
from
Matej Cepl (mcepl)
(revision 7)
Replace %fdupes -s with plain %fdupes; hardlinks are better.
Ludwig Nussel (lnussel_factory)
accepted
request 730102
from
Tomáš Chvátal (scarabeus_iv)
(revision 6)
- Update to 3.4.5:
  * Fixed security bug in downloader: Zip slip vulnerability, for the unlikely situation where a user configures their downloader to use a compromised server (CVE-2019-14751)
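For context on the Zip slip class of bug named above: an archive member whose name contains `..` or an absolute path can be written outside the intended directory when extracted naively. The following is a minimal sketch of the usual defence, assuming POSIX-style paths; the helper names `is_within` and `safe_extract` are hypothetical and this is not the code NLTK shipped in its fix.

```python
import os
import zipfile


def is_within(dest_dir, member_name):
    """Return True if member_name, joined onto dest_dir, stays inside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(dest_root, member_name))
    # commonpath equals dest_root only when target is dest_root or below it.
    return os.path.commonpath([dest_root, target]) == dest_root


def safe_extract(archive_path, dest_dir):
    """Extract a zip archive, refusing any member that would escape dest_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        for member in zf.namelist():
            if not is_within(dest_dir, member):
                raise ValueError(f"unsafe path in archive: {member!r}")
        zf.extractall(dest_dir)
```

The check runs over every member before anything is written, so a single malicious entry rejects the whole archive rather than leaving a partial extraction behind.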
Dominique Leuenberger (dimstar_suse)
accepted
request 717915
from
Tomáš Chvátal (scarabeus_iv)
(revision 5)
- Update to 3.4.4:
  * fix bug in plot function (probability.py)
  * add improved PanLex Swadesh corpus reader
  * add Text.generate()
  * add QuadgramAssocMeasures
  * add SSP to tokenizers
  * return confidence of best tag from AveragedPerceptron
  * make plot methods return Axes objects
  * don't require list arguments to PositiveNaiveBayesClassifier.train
  * fix Tree classes to work with native Python copy library
  * fix inconsistency for NomBank
  * fix random seeding in LanguageModel.generate
  * fix ConditionalFreqDist mutation on tabulate/plot call
  * fix broken links in documentation
  * fix misc Wordnet issues
  * update installation instructions
Dominique Leuenberger (dimstar_suse)
accepted
request 705020
from
Tomáš Chvátal (scarabeus_iv)
(revision 4)
Dominique Leuenberger (dimstar_suse)
accepted
request 603179
from
Tomáš Chvátal (scarabeus_iv)
(revision 2)
- Trim redundant wording from description.
Dominique Leuenberger (dimstar_suse)
accepted
request 583014
from
Atri Bhattacharya (badshah400)
(revision 1)
NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets and tutorials supporting research and development in Natural Language Processing.
Displaying all 19 revisions