Request 1084865 (accepted)

Overview

Request 1084865 accepted

HAS TO GO TOGETHER WITH sr#1084806!

- Update to version 5.0.7:
* Previously, uproot.dask would default to step_size="100 MB" if
open_files=True and whole-file-steps (limit on step size) if
open_files=False. Now both open_files cases default to
steps_per_file=1 (whole-file-steps) for uniformity.
* feat: add in capability for blindly splitting files into
chunks for dask (gh#scikit-hep/uproot5#876).
- Drop tests.tar.xz as additional Source as test files are now
included with upstream pypi tarball; drop associated _service.
- Update required version of scikit-hep-testdata to 0.4.30 for
tests.
- Disable building against python3.11 until numba -- and dask --
is compatible.
- Disable 32-bit builds since python-awkward no longer builds for
those archs.
- Clean up SPEC file
- update to 5.0.3:
* feat: Infrastructure for writing of RNTuple (incomplete
functionality)
* feat: [WIP] RNTuple Basic Writing
* fix: an uproot.dask test was wrong; revealed by new dask-
awkward.
* fix: separate AwkwardForth machine for each TBranch context.
* fix: separate ZstdDecompressor for each thread.
* fix: uproot.dask: Protect against `branches=None` in
`project_columns`
* fix: AsStridedObjects.awkward_form was still including the
'@' members.
* fix: protect Uproot's 'project_columns' from Dask node names.
* Uproot version 5 has a few major new features, one removal
(`uproot.lazy`), and is based on Awkward Array version 2
instead of version 1.
* @kkothari2001 upgraded Uproot from Awkward version 1 to
version 2, the major part of which was replacing
`uproot.lazy`, which is based on Awkward 1's virtual and
partitioned lazy arrays, with the new Dask collection, dask-
awkward. The entry point for this function is `uproot.dask`.
* @kkothari2001 also simplified Uproot's Pandas backend, which
used to "explode" ragged arrays from ROOT into Pandas
DataFrames with a non-trivial MultiIndex. Now, it takes
advantage of awkward-pandas to put ragged (and more complex)
Awkward Arrays directly into Pandas columns.
* If you want the old behavior, you can read data using
`library="ak"` to get an Awkward Array, and use
ak.to_dataframe to "explode" the data into a MultiIndex.
* @aryan26roy added a new code path to the TTree-reading
routines to read them with AwkwardForth instead of pure
Python. Users won't see any _interface_ changes due to this
code, but the performance of reading TBranches with
`AsObject` or `AsStrings` Interpretations should be orders of
magnitude faster. For example, `std::vector>` reading is now
400× faster.
* @Moelf added a complete reader of RNTuple data with most of
an RNTuple-writer in an unmerged pull request (#705).
Although the RNTuple format is still in development, this is
a very good start at reading RNTuple data, whose structure is
a close match to Awkward Arrays (so the translation is more
one-to-one than it is for TTrees, for instance).
* feat: move to hatchling
* feat: `from_map` like optimization for dask arrays
* feat: Finalizing AwkwardForth reader for Uproot
* feat: implemented NON-memberwise deserialization for AsMap.
* feat: Added column_projection optimization
* feat: support categorical axes on boost histograms
* feat: warn about TBranch name, alias name conflict.
* feat: any Mapping assigned to a WritableDirectory is
interpreted as a TTree or failure, no fall-through.
* feat: add 'interp_options' mechanism and ak_add_doc.
* feat: Use awkward pandas, instead of the existing code that
explodes Pandas Dataframes
* feat: made 'very optional' arguments keyword-only
* feat: adjust for name change in scikit-hep/awkward#1919.
* fix: depend on packaging, not setuptools vendored packaging
* fix: Avoid triggering temporary dask-awkward/awkward
incompatibility.
* fix: Do not write incorrect fSumw2 in histograms (v5).
* fix: Fixes uproot.dask bug with empty branches
* fix: Use `from_map` optimization for delayed numpy arrays and
add tests with empty branches for the same
* fix: use ctx manager to ensure resources are freed
* fix: ReadOnlyDirectory should provide the largest abs(cycle)
when cycle is unspecified, not the largest cycle.
* fix: regularize ROOT type aliases to C fundamental type
names.
* fix: avoid empty TBasket issue in embedded TBasket
* fix: don't use Awkward in test_0751 that doesn't need it
* fix: working TList serialization
* fix: histogram weights not handled correctly in hist / boost
conversion
* perf: streamline metadata handling for TBranch name lookup
and uproot.dask
* fix: ensure AwkwardForth fallback path is tested without
history.
* fix: all AwkwardForth Forms now agree with awkward_form
method output.
* fix: Uproot tests now work with Awkward 2.0.0.
- drop uproot-use-packaging-module.patch (upstream)
- Update to version 4.3.4:
* fixed uninitialized attributes of ReadOnlyDirectory
[gh#scikit-hep/uproot4#661].
- Refresh uproot-use-packaging-module.patch to correct use of
packaging.version.
- Update to version 4.3.3:
* Fixed an O(n²) scaling bug in getting data from TDirectories
so that now it's O(n) [gh#scikit-hep/uproot4#639].
- Update to version 4.3.0:
* Added a TMatrixTSym model (long stalled PR):
gh#scikit-hep/uproot4#484.
* Restored the argument list of Interpretation.awkward_form, a
public function: PR gh#scikit-help/uproot4#618.
- Refresh tests tarball with updated dir from git repository.
- Update to version 4.2.2:
* Restore performance hack for AsArray(True, False,
AsVector(False, dtype)) [gh#scikit-hep/uproot4#572].
- Update to version 4.2.1:
* Added a rule to skip parsing Float16/Double32 TBranch titles
if the title is not parsable (and just assume default number
of bits) [gh#scikit-hep/uproot4#561].
* Removed references to deprecated distutils and Pandas
Int64Index [gh#scikit-hep/uproot4#564].
* Removed the rule that interpreted fBits as 1 byte (it's 4
bytes everywhere except in some branches of some Delphes
files) [gh#scikit-hep/uproot4#570].
- Add uproot-use-packaging-module.patch -- Use packaging module
directly instead of calling it via setup.extern; the latter does
not work on openSUSE directly.
- Introduce Requires and BuildRequires (for tests) on
python-packaging in light of above patch.
- Update tests.tar.xz to tagged 4.2.1 version.
- Update to version 4.2.0:
* Drop Python2isms from the codebase:
[gh#scikit-hep/uproot#526].
* Make writing to Python file handles work (fixed a
half-finished, forgotten implementation):
[gh#scikit-hep/uproot#538].
* Fix cut jagged-array corner-case in library="pd".
* Fix a case in which the instance version is 0, but the
streamer version is not: [gh#scikit-hep/uproot#537].
* Fix uproot.WritableTTree.extend when the metadata needs to be
rewritten: [gh#scikit-hep/uproot#547].
* When checking to see if something in file["name"] = something
is an Awkward Array or Pandas DataFrame for creating a TTree,
also check for superclasses (the whole mro):
[gh#scikit-hep/uproot#557].
- Sync tests tarball with version 4.2.0.
- Update to version 4.1.9:
* [gh#scikit-hep/uproot#523] docs: add pfackeldey as a
contributor for code.
* [gh#scikit-hep/uproot#522] [pre-commit.ci] pre-commit
autoupdate.
* [gh#scikit-hep/uproot#506] add uproot.model.TTable.data
getter.
* [gh#scikit-hep/uproot#521] Dynamic classes can’t be ABC
subclasses, such as Sequence.
* [gh#scikit-hep/uproot#519] [MemmapSource] Remove
unnecessary(?) copies.
* [gh#scikit-hep/uproot#517] Be more careful about identifying
pd.DataFrame.
- Update tests.tar.xz to version 4.1.9.
- Disable an additional test on 32-bit that depends on 64-bit long.
- Update to v4.0.11
* see https://uproot.readthedocs.io/en/latest/changelog.html
- Update runtime and builtime requirements:
* Add awkward as recommendation (upstream: "highly recommended")
* Add other optional package as suggestions
* BuildRequire what is available and used for offline tests
- Skip some pandas tests which test for 64bit types on 32bit builds
- use the pytest network mark in order to skip network tests on obs
- add a network mark to test_0220 -- gh#scikit-hep/uproot4#396
- Disable builds for python 3.6: no numpy.
- Initial package.
- Package tests subdir from upstream git since they are missing
from the PyPI src tarballs.