File python-pyarrow.changes of Package apache-arrow
-------------------------------------------------------------------
Thu Sep 26 23:24:22 UTC 2024 - Guang Yee <gyee@suse.com>

- Enable sle15_python_module_pythons.

-------------------------------------------------------------------
Wed Aug 14 20:27:48 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 17.0.0
  ## Bug Fixes
  * [C++][Python] Fix casting to extension type with fixed size list storage type (#42219)
  * [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
  * [C++][Python] RecordBatch.filter() segfaults if passed a ChunkedArray (#40971)
  * [Python] pa.array: add check for byte-swapped numpy arrays inside python objects (#41549)
  * [Python] Fix read_table for encrypted parquet (#39438)
  * [Python] RunEndEncodedArray.from_arrays: bugfix for Array arguments (#40560) (#41093)
  * [C++][Python] Map child Array constructed from keys and items shouldn’t have offset (#40871)
  * [Python] `test_numpy_array_protocol` test failures with numpy 2.0.0rc1
  * [Python] Fix StructArray.sort() for by=None (#41495)
  * [Python] Build with Python 3.13 (#42034)
  * [Python] remove special methods related to buffers in python <2.6 (#41492)
  * [Python] Fix reading column index with decimal values (#41503)
  * [Docs][Python] Remove duplicate contents (#41588)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays (#41757)
  * [Python][Parquet] Implement to_dict method on SortingColumn (#41704)
  * [Python] CMake: ignore Parquet encryption option if Parquet itself is not enabled (fix Java integration build) (#41776)
  * [Python] Disallow direct pa.RecordBatchReader() construction to avoid segfaults (#41773)
  * [Python] Fix RecordBatchReader.cast to support casting to equal schema for all types (#42098)
  * [Python] Fix tests when using NumPy 2.0 on Windows (#42099)
  * [CI][Python] Use pip install -e instead of setup.py build_ext --inplace for installing pyarrow on verification script (#42007)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows (#42022)
  * [Python][CI] Update expected output for numpy 2.0.0 (#42172)
  ## New Features and Improvements
  * [Python] Replace pandas.util.testing.rands with vendored version (#42089)
  * [Python] begin moving static settings to pyproject.toml (#41041)
  * [Python] Implement PyCapsule interface for Device data in PyArrow (#40717)
  * [Python] Expand the Arrow PyCapsule Interface with C Device Data support (#40708)
  * [Python] Let RecordBatch.filter accept a boolean expression in addition to mask array (#43043)
  * [Python] Fix pickling of LocalFileSystem for cython 2 (#41459)
  * [Python] Expand the C Device Interface bindings to support import on CUDA device (#40385)
  * [Python] Allow passing a mapping of column names to rename_columns (#40645)
  * [Python][Packaging] Strip unnecessary symbols when building wheels (#42028)
  * [Python][Docs] Update PyArrow installation docs for conda package split (#41135)
  * [Python] Basic bindings for Device and MemoryManager classes (#41685)
  * [C++][Python] Expose recursive flatten for lists on list_flatten kernel function and pyarrow bindings (#41295)
  * [Python][Packaging] Ensure to build with released numpy 2.0 (instead of RC) in the wheel building workflows (#42194)
  * [CI][Python] Add a job on ARM64 macOS (#41313)
  * [CI][Python] Reduce CI time on macOS (#41378)
  * [Python] Expose byte_width and bit_width of ExtensionType in terms of the storage type (#41413)
  * [Python] Update Python development guide about components being enabled by default based on Arrow C++ (#41705)
  * [Python] Building PyArrow: enable/disable python components by default based on availability in Arrow C++ (#41494)
  * [C++][Python] Extends the add_key_value to parquet::arrow and PyArrow (#41633)
  * [Python] Ensure Buffer methods don’t crash with non-CPU data (#41889)
  * [C++][Python] PrettyPrint non-cpu data by copying to default CPU device (#42010)
  * [Python][Parquet] Update BYTE_STREAM_SPLIT description in write_table() docstring (#41759)
  * [Python] Add support for Pyodide (#37822)
  * [Python] Fix pandas tests to follow downstream datetime64 unit changes (#41979)
  * [Python] Allow Array.filter() to take general array input (#42051)
  * [Python] Expose new FLOAT16 logical type in the pyarrow.parquet bindings (#42103)
  * [Python] Array gracefully fails on non-cpu device (#42113)
  * [Python][Parquet] Pyarrow store decimal as integer (#42169)
  * [Python] Add CI job for Numpy 1.X (#42189)
  * [CI][Python] Pin openjdk=17 in python substrait integration (#43051)
- Drop pyarrow-pr41319-numpy2-tests.patch
- Add pyarrow-pr433325-extradirs.patch gh#apache/arrow/pull/43325

-------------------------------------------------------------------
Thu Apr 25 08:58:22 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 16.0.0
  * [Python] construct pandas.DataFrame with public API in to_pandas (#40897)
  * [Python] Fix ORC test segfault in the python wheel windows test (#40609)
  * [Python] Attach Python stacktrace to errors in ConvertPyError (#39380)
  * [Python] Plug reference leaks when creating Arrow array from Python list of dicts (#40412)
  * [Python] Empty slicing an array backwards beyond the start is now empty (#40682)
  * [Python] Slicing an array backwards beyond the start now includes first item.
    (#39240)
  * [Python] Calling pyarrow.dataset.ParquetFileFormat.make_write_options as a class method results in a segfault (#40976)
  * [Python] Fix parquet import in encryption test (#40505)
  * [Python] fix raising ValueError on _ensure_partitioning (#39593)
  * [Python] Validate max_chunksize in Table.to_batches (#39796)
  * [C++][Python] Fix test_gdb failures on 32-bit (#40293)
  * [Python] Make Tensor.__getbuffer__ work on 32-bit platforms (#40294)
  * [Python] Avoid using np.take in Array.to_numpy() (#40295)
  * [Python][C++] Fix large file handling on 32-bit Python build (#40176)
  * [Python] Update size assumptions for 32-bit platforms (#40165)
  * [Python] Fix OverflowError in foreign_buffer on 32-bit platforms (#40158)
  * [Python] Add Type_FIXED_SIZE_LIST to _NESTED_TYPES set (#40172)
  * [Python] Mark ListView as a nested type (#40265)
  * [Python] only allocate the ScalarMemoTable when used (#40565)
  * [Python] Error compiling Cython files on Windows during release verification
  * [Python] Fix flake8 failures in python/benchmarks/parquet.py (#40440)
  * [Python] Suppress python/examples/minimal_build/Dockerfile.* warnings (#40444)
  * [Python][Docs] Add workaround for autosummary (#40739)
  * [Python] BUG: Empty slicing an array backwards beyond the start should be empty
  * [CI][Python] Activate ARROW_PYTHON_VENV if defined in sdist-test job (#40707)
  * [CI][Python] CI failures on Python builds due to pytest_cython (#40975)
  * [Python] ListView pandas tests should use np.nan instead of None (#41040)
  * [C++][Python] Sporadic asof_join failures in PyArrow
  ## New Features and Improvements
  * [Python][CI] Remove legacy hdfs tests from hdfs and hypothesis setup (#40363)
  * [Python] Remove deprecated pyarrow.filesystem legacy implementations (#39825)
  * [C++][Python] Add missing methods to RecordBatch (#39506)
  * [Python][CI] Support ORC in Windows wheels
  * [Python] Correct test marker for join_asof tests (#40666)
  * [Python] Add join_asof binding (#34234)
  * [Python] Add a function to download and extract timezone database on Windows (#38179)
  * [Python][CI][Packaging] Enable ORC on Windows Appveyor CI and Windows wheels for pyarrow
  * [Python] Add a FixedSizeTensorScalar class (#37533)
  * [Python][CI][Dev][Python] Release and merge script errors (#37819)" (#40150)
  * [Python] Construct pyarrow.Field and ChunkedArray through Arrow PyCapsule Protocol (#40818)
  * [Python] Fix missing byte_width attribute on DataType class (#39592)
  * [Python] Compatibility with NumPy 2.0
  * [Packaging][Python] Enable building pyarrow against numpy 2.0 (#39557)
  * [Python] Basic pyarrow bindings for Binary/StringView classes (#39652)
  * [Python] Expose force_virtual_addressing in PyArrow (#39819)
  * [Python][Parquet] Support hashing for FileMetaData and ParquetSchema (#39781)
  * [Python] Add bindings for ListView and LargeListView (#39813)
  * [Python][Packaging] Build pyarrow wheels with numpy RC instead of nightly (#41097)
  * [Python] Support creating Binary/StringView arrays from python objects (#39853)
  * [Python] ListView support for pa.array() (#40160)
  * [Python][CI] Remove upper pin on pytest (#40487)
  * [Python][FS][Azure] Minimal Python bindings for AzureFileSystem (#40021)
  * [Python] Low-level bindings for exporting/importing the C Device Interface (#39980)
  * [Python] Add ChunkedArray import/export to/from C (#39985)
  * [Python] Use Cast() instead of CastTo (#40116)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor (#40064)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor - add support for different data types (#40359)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor - add option to cast NULL to NaN (#40803)
  * [Python] Support requested_schema in __arrow_c_stream__() (#40070)
  * [Python] Support Binary/StringView conversion to numpy/pandas (#40093)
  * [Python] Allow FileInfo instances to be passed to dataset init (#40143)
  * [Python][CI] Add 32-bit Debian build on Crossbow (#40164)
  * [Python] ListView arrow-to-pandas conversion (#40482)
  * [Python][CI] Disable generating C lines in Cython tracebacks (#40225)
  * [Python] Support construction of Run-End Encoded arrays in pa.array(..) (#40341)
  * [Python] Accept dict in pyarrow.record_batch() function (#40292)
  * [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize (#40418)
  * [Python][CI] Fix install of nightly dask in integration tests (#40378)
  * [Python] Fix byte_width for binary(0) + fix hypothesis tests (#40381)
  * [Python][CI] Fix dataset partition filter tests with pandas nightly (#40429)
  * [Docs][Python] Added JsonFileFormat to docs (#40585)
  * [Dev][C++][Python][R] Use pre-commit for clang-format (#40587)
  * [Python][C++] Support conversion of pyarrow.RunEndEncodedArray to numpy/pandas (#40661)
  * [Python] Simplify and improve perf of creation of the column names in Table.to_pandas (#40721)
  * [Docs][C++][Python] Add initial documentation for RecordBatch::Tensor conversion (#40842)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor - add support for row-major (#40867)
  * [CI][Python] check message in test_make_write_options_error for Cython 2 (#41059)
  * [Python] Add copy keyword in Array.array for numpy 2.0+ compatibility (#41071)
  * [Python][Packaging] PyArrow wheel building is failing because of disabled vcpkg install of liblzma
- Drop apache-arrow-pr40230-glog-0.7.patch
- Drop apache-arrow-pr40275-glog-0.7-2.patch
- Add pyarrow-pr41319-numpy2-tests.patch gh#apache/arrow#41319

-------------------------------------------------------------------
Sat Mar 23 15:23:23 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string (#40486)

-------------------------------------------------------------------
Wed Feb 28 12:12:36 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Move to science/apache-arrow as multibuild package
- Also needs the cpp GLOG patches
  * Add apache-arrow-pr40230-glog-0.7.patch
  * Add apache-arrow-pr40275-glog-0.7-2.patch

-------------------------------------------------------------------
Fri Feb 23 17:35:37 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 15.0.1
  ## Bug Fixes
  * [Python] Fix race condition in _pandas_api#_check_import (#39314)
  * [Python] Avoid leaking references to Numpy dtypes (#39636)
  * [Release] Update platform tags for macOS wheels to macosx_10_15 (#39657)
  * [Python][CI] Fix test failures with latest/nightly pandas (#39760)
  * [C#] Restore support for .NET 4.6.2 (#40008)
  * [Python] Make capsule name check more lenient (#39977)
  * [Python][FlightRPC] Release GIL in GeneratorStream (#40005)
  ## New Features and Improvements
  * [Python] Remove the use of pytest-lazy-fixture (#39850)
  * [Python][CI] Pin moto<5 for dask integration tests (#39881)
  * [Python] Fix tests for pandas with CoW / nightly integration tests (#40000)
- Release 15.0.0
  ## Bug Fixes
  * [C++][Python] Add a no-op kernel for dictionary_encode(dictionary) (#38349)
  * [Python] Fix S3FileSystem equals None segfault (#39276)
  * Fix TestArrowReaderAdHoc.ReadFloat16Files to use new uncompressed files (#38825)
  * [Python] Fix spelling (#38945)
  * [CI][Python] Update pandas tests failing on pandas nightly CI build (#39498)
  * [CI][JS] Force node 20 on JS build on arm64 to fix build issues (#39499)
  ## New Features and Improvements
  * [C++][Python] Add "Z" to the end of timestamp print string when tz defined (#39272)
  * [Python] Remove the legacy ParquetDataset custom python-based implementation (#39112)
  * [Python] add Table.to/from_struct_array (#38520)
  * [C++][Python] DLPack implementation for Arrow Arrays (producer) (#38472)
  * [Python] FixedSizeListArray.from_arrays supports mask parameter (#39396)
  * [C++][Python][R] Allow users to adjust S3 log level by environment variable (#38267)
  * [Python] Expose Parquet sorting metadata (#37665)
  * [C++][Python][Parquet] Implement Float16 logical type (#36073)
  * [Python] Make CacheOptions configurable from Python (#36627)
  * [Python][Parquet] Parquet Support write and validate Page CRC (#38360)
  * [Python][Dataset] Expose file size to python dataset (#37868)
  * [R] Allow code() to return package name prefix. (#38144)
  * [Python] Remove usage of pandas internals DatetimeTZBlock (#38321)
  * Add validation logic for offsets and values to arrow.array.ListArray.fromArrays (#38531)
  * [Python][Compute] Describe strptime format semantics (#38665)
  * [Python] Remove dead code in _reconstruct_block (#38714)
  * [Python] Fix append mode for cython 2 (#39027)
  * [Python] Add append mode for pyarrow.OsFile (#38820)
  * [Python] Extract libparquet requirements out of libarrow_python.so to new libarrow_python_parquet_encryption.so (#39316)
  * Create module info compiler plugin (#39135)
  * [Python] RecordBatchReader.from_stream constructor for objects implementing the Arrow PyCapsule protocol (#39218)
  * [Python] Pass in type to MapType.from_arrays (#39516)
  * [Python][CI] Skip failing dask tests: test_describe_empty and test_view (#39534)
  * [Python] NumPy 2.0 compat: remove usage of np.core (#39535)
  * [Packaging][Python] Add a numpy<2 pin to the install requirements for the 15.x release branch (#39538)

-------------------------------------------------------------------
Mon Jan 15 20:42:25 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 14.0.2
  ## New Features and Improvements
  * GH-38342 - [Python] Update to_pandas to use non-deprecated DataFrame constructor (#38374)
  * GH-38364 - [Python] Initialize S3 on first use (#38375)
  ## Bug Fixes
  * GH-38345 - [Release] Use local test data for verification if possible (#38362)
  * GH-38577 - Reading parquet file behavior change from 13.0.0 to 14.0.0
  * GH-38626 - [Python] Fix segfault when PyArrow is imported at shutdown (#38637)
  * GH-38676 - [Python] Fix potential deadlock when CSV reading errors out (#38713)
  * GH-38984 - [Python][Packaging] Verification of wheels on AlmaLinux 8 are failing due to missing pip (#38985)
  * GH-39074 - [Release][Packaging] Use UTF-8 explicitly for KEYS (#39082)

-------------------------------------------------------------------
Tue Nov 14 23:29:03 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>

- Fix cve in changelog

-------------------------------------------------------------------
Tue Nov 14 09:28:23 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>

- Update to 14.0.1
- drop pyarrow-pr37481-pandas2.1.patch
- fixes boo#1216991 CVE-2023-47248
  * GH-38431 - [Python][CI] Update fs.type_name checks for s3fs tests
  * GH-38607 - [Python] Disable PyExtensionType autoload
- update to 14.0.0
  * very long list of changes can be found here:
    https://arrow.apache.org/release/14.0.0.html

-------------------------------------------------------------------
Thu Aug 31 18:43:55 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Update to 13.0.0
  ## Compatibility notes:
  * The default format version for Parquet has been bumped from 2.4 to 2.6 GH-35746. In practice, this means that nanosecond timestamps now preserve its resolution instead of being converted to microseconds.
  * Support for Python 3.7 is dropped GH-34788
  ## New features:
  * Conversion to non-nano datetime64 for pandas >= 2.0 is now supported GH-33321
  * Write page index is now supported GH-36284
  * Bindings for reading JSON format in Dataset are added GH-34216
  * keys_sorted property of MapType is now exposed GH-35112
  ## Other improvements:
  * Common python functionality between Table and RecordBatch classes has been consolidated (GH-36129, GH-35415, GH-35390, GH-34979, GH-34868, GH-31868)
  * Some functionality for FixedShapeTensorType has been improved (__reduce__ GH-36038, picklability GH-35599)
  * Pyarrow scalars can now be accepted in the array constructor GH-21761
  * DataFrame Interchange Protocol implementation and usage is now documented GH-33980
  * Conversion between Arrow and Pandas for map/pydict now has enhanced support GH-34729
  * Usability of pc.map_lookup / MapLookupOptions is improved GH-36045
  * zero_copy_only keyword can now also be accepted in ChunkedArray.to_numpy() GH-34787
  * Python C++ codebase now has linter support in Archery and the CI GH-35485
  ## Relevant bug fixes:
  * __array__ numpy conversion for Table and RecordBatch is now corrected so that np.asarray(pa.Table) doesn’t return a transposed result GH-34886
  * parquet.write_to_dataset doesn’t create empty files for non-observed dictionary (category) values anymore GH-23870
  * Dataset writer now also correctly follows default Parquet version of 2.6 GH-36537
  * Comparing pyarrow.dataset.Partitioning with other type is now correctly handled GH-36659
  * Pickling of pyarrow.dataset PartitioningFactory objects is now supported GH-34884
  * None schema is now disallowed in parquet writer GH-35858
  * pa.FixedShapeTensorArray.to_numpy_ndarray is not failing on sliced arrays GH-35573
  * Halffloat type is now supported in the conversion from Arrow list to pandas GH-36168
  * __from_arrow__ is now also implemented for Array.to_pandas for pandas extension data types GH-36096
- Add pyarrow-pr37481-pandas2.1.patch
  gh#apache/arrow#37481

-------------------------------------------------------------------
Fri Aug 25 12:52:17 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Limit to Cython < 3

-------------------------------------------------------------------
Mon Jun 12 12:22:31 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Update to 12.0.1
  ## Bug Fixes
  * [GH-35389] - [Python] Fix coalesce_keys=False option in join operation (#35505)
  * [GH-35821] - [Python][CI] Skip extension type test failing with pandas 2.0.2 (#35822)
  * [GH-35845] - [CI][Python] Fix usage of assert_frame_equal in test_hdfs.py (#35842)
  ## New Features and Improvements
  * [GH-35329] - [Python] Address pandas.types.is_sparse deprecation (#35366)
- Drop pyarrow-pr35822-pandas2-extensiontype.patch

-------------------------------------------------------------------
Wed Jun 7 07:39:44 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Skip invalid pandas 2 test
  * pyarrow-pr35822-pandas2-extensiontype.patch
  * gh#apache/arrow#35822
  * gh#apache/arrow#35839

-------------------------------------------------------------------
Thu May 18 07:28:28 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Update to 12.0.0
  ## Compatibility notes:
  * Plasma has been removed in this release (GH-33243). In addition, the deprecated serialization module in PyArrow was also removed (GH-29705). IPC (Inter-Process Communication) functionality of pyarrow or the standard library pickle should be used instead.
  * The deprecated use_async keyword has been removed from the dataset module (GH-30774)
  * Minimum Cython version to build PyArrow from source has been raised to 0.29.31 (GH-34933). In addition, PyArrow can now be compiled using Cython 3 (GH-34564).
  ## New features:
  * A new pyarrow.acero module with initial bindings for the Acero execution engine has been added (GH-33976)
  * A new canonical extension type for fixed shaped tensor data has been defined. This is exposed in PyArrow as the FixedShapeTensorType (GH-34882, GH-34956)
  * Run-End Encoded arrays binding has been implemented (GH-34686, GH-34568)
  * Method is_nan has been added to Array, ChunkedArray and Expression (GH-34154)
  * Dataframe interchange protocol has been implemented for RecordBatch (GH-33926)
  ## Other improvements:
  * Extension arrays can now be concatenated (GH-31868)
  * get_partition_keys helper function is implemented in the dataset module to access the partitioning field’s key/value from the partition expression of a certain dataset fragment (GH-33825)
  * PyArrow Array objects can now be accepted by the pa.array() constructor (GH-34411)
  * The default row group size when writing parquet files has been changed (GH-34280)
  * RecordBatch has the select() method implemented (GH-34359)
  * New method drop_column on the pyarrow.Table supports passing a single column as a string (GH-33377)
  * User-defined tabular functions, which are a user-functions implemented in Python that return a stateful stream of tabular data, are now also supported (GH-32916)
  * Arrow Archery tool now includes linting of the Cython files (GH-31905)
  * Breaking Change: Reorder output fields of “group_by” node so that keys/segment keys come before aggregates (GH-33616)
  ## Relevant bug fixes:
  * Acero can now detect and raise an error in case a join operation needs too much bytes of key data (GH-34474)
  * Fix for converting non-sequence object in pa.array() (GH-34944)
  * Fix erroneous table conversion to pandas if table includes an extension array that does not implement to_pandas_dtype (GH-34906)
  * Reading from a closed ArrayStreamBatchReader now returns invalid status instead of segfaulting (GH-34165)
  * array() now returns pyarrow.Array and not pyarrow.ChunkedArray for columns with __arrow_array__ method and only one chunk so that the conversion of pandas dataframe with categorical column of dtype string[pyarrow] does not fail (GH-33727)
  * Custom type mapper in to_pandas now converts
    index dtypes together with column dtypes (GH-34283)

-------------------------------------------------------------------
Wed Mar 29 13:25:55 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Fix tests expecting the jemalloc backend which was disabled in
  the apache-arrow package

-------------------------------------------------------------------
Sun Mar 12 05:31:32 UTC 2023 - Ben Greiner <code@bnavigator.de>

- Update to v11.0.0
  * [Python][Doc] Add five more numpydoc checks to CI (#15214)
  * [Python][CI][Doc] Enable numpydoc check PR03 (#13983)
  * [Python] Expose flag to enable/disable storing Arrow schema in Parquet metadata (#13000)
  * [Python] Add support for reading record batch custom metadata API (#13041)
  * [Python] Add lazy Dataset.filter() method (#13409)
  * [Python] ParquetDataset to still take legacy code path when old filesystem is passed (#15269)
  * [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset (#14052)
  * [Python] Support lazy Dataset.filter
  * [Python] Order of columns in pyarrow.feather.read_table (#14528)
  * [Python] Construct MapArray from sequence of dicts (instead of list of tuples) (#14547)
  * [Python] Unify CMakeLists.txt in python/ (#14925)
  * [C++][Python] Implement list_slice kernel (#14395)
  * [C++][Python] Enable struct_field kernel to accept string field names (#14495)
  * [Python][C++] Add use_threads to run_substrait_query
  * [Python][Docs] adding info about TableGroupBy.aggregation with empty list (#14482)
  * [Python] DataFrame Interchange Protocol for pyarrow Table
  * [Python] Drop older versions of Pandas (<1.0) (#14631)
  * [Python] Pass Cmake args to Python CPP
  * [Docs][Python] Improve docs for S3FileSystem (#14599)
  * [Python] Add missing value accessor to temporal types (#14746)
  * [Python] Expose time32/time64 scalar values (#14637)
  * [Python] Remove gcc 4.9 compatibility code (#14602)
  * [C++][Python] Support slicing to end in list_slice kernel (#14749)
  * [C++][Python] Support step >= 1 in list_slice kernel (#14696)
  * [Release][Python] Upload .wheel/.tar.gz for release not RC (#14708)
  * [Python] Expose Scalar.validate() (#15149)
  * [Python] PyArrow C++ header files no longer always included in installed pyarrow (#14656)
  * [Doc][Python] Update note about bundling Arrow C++ on Windows (#14660)
  * [Python] Reduce warnings during tests (#14729)
  * [Python] Expose reading a schema from an IPC message (#14831)
  * [Python] Expose QuotingStyle to Python (#14722)
  * [Python] Add (Chunked)Array sort() method (#14781)
  * [Python] Dataset.sort_by (#14976)
  * [Python] Avoid dependency on exec plan in Table.sort_by to fix minimal tests (#15268)
  * [Python] Remove auto generated pyarrow_api.h and pyarrow_lib.h (#15219)
  * [Python] Error if datetime.timedelta to pyarrow.duration conversion overflows (#13718)
  * [Python] to_pandas fails with FixedOffset timezones when timestamp_as_object is used (#14448)
  * [Python] Pass **kwargs in read_feather to to_pandas() (#14492)
  * [Python] Add python test for decimals to csv (#14525)
  * [Python] Test that reading of timedelta is stable (read_feather/to_pandas) (#14531)
  * [C++][Python] Improve s3fs error message when wrong region (#14601)
  * [Python][C++] Adding support for IpcWriteOptions to the dataset ipc file writer (#14414)
  * [Python] Support passing create_dir thru pq.write_to_dataset (#14459)
  * [CI][Python] Fix pandas master/nightly build failure related to timedelta (#14460)
  * [Python] Fix writing files with multi-byte characters in file name (#14764)
  * [Python] Handle pytest 8 deprecations about pytest.warns(None)
  * [Python] Remove ARROW_BUILD_DIR in building pyarrow C++ (#14498)
  * [Python] Honor default memory pool in Dataset scanning (#14516)
  * [Python] Fully support filesystem in parquet.write_metadata (#14574)
  * [Python] Check schema argument type in RecordBatchReader.from_batches (#14583)
  * [Python][Docs] PyArrow table join docstring typos for left and right suffix arguments (#14591)
  * [Python] pass back time types with correct type class (#14633)
  * [Python] Support filesystem parameter in ParquetFile (#14717)
  * [Python][Docs] Add missing CMAKE_PREFIX_PATH to allow setup.py CMake invocations to find Arrow CMake package (#14586)
  * [Python][CI] Add DYLD_LIBRARY_PATH to avoid requiring PYARROW_BUNDLE_ARROW_CPP on macOS job (#14643)
  * [Python] Don't crash when schema=None in FlightClient.do_put (#14698)
  * [Python] Change warnings to _warnings in _plasma_store_entry_point (#14695)
  * [CI][Python] Update nightly test-conda-python-3.7-pandas-0.24 to pandas >= 1.0 (#14714)
  * [CI][Python] Update spark test modules to match spark master (#14715)
  * [Python] Fix test_s3fs_wrong_region; set anonymous=True (#14716)
  * [Python][CI] Fix nightly job using pandas dev (temporarily skip tests) (#15048)
  * [Python] Quadratic memory usage of Table.to_pandas with nested data
  * [Python] Fix pyarrow.get_libraries() order (#14944)
  * [Python] Fix segfault for dataset ORC write (#15049)
  * [Python][Docs] Update docstring for pyarrow.decompress (#15061)
  * [Python][CI] Dask nightly tests are failing due to fsspec bug (#15065)
  * [C++][Python][FlightRPC] Make DoAction truly streaming (#15118)
  * [Benchmarking][Python] Set ARROW_INSTALL_NAME_RPATH=ON for benchmark builds (#15123)
  * [Python][macOS] Use `@rpath` for libarrow_python.dylib (#15143)
  * [Python] Docstring test failure (#15186)
  * [Python] Don't use target_include_directories() for imported target (#33606)
  * [Python] Make CSV cancellation test more robust
  * [Python][CI] Python sdist installation fails with latest setuptools 58.5
  * [Python] Missing bindings for existing_data_behavior makes it impossible to maintain old behavior
  * [Python] update trove classifiers to include Python 3.10
  * [Release][Python] Use python -m pytest
  * [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build
  * [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns
  * [Python][Packaging] Set deployment target to 10.13 for universal2 wheels
  * [Python] Fix crash in take/filter of empty ExtensionArray
  * [Python] Move marks from fixtures to individual tests/params
  * [Python][CI] Requiring s3fs >= 2021.8
  * [Python] Allow writing datasets using a partitioning that only specifies field_names
  * [Python] Table.from_arrays should raise an error when array is empty but names is not
  * [Python][Packaging] Pin minimum setuptools version for the macos wheels
  * [Python][Doc] Document nullable dtypes handling and usage of types_mapper in to_pandas conversion
  * [C++][Python] Fix unique/value_counts on empty dictionary arrays
  * [Python][CI] Fix tests using OrcFileFormat for Python 3.6 + orc not built
  * [Python] Fix FlightClient.do_action
  * [Python][Docs] Fix usage of sync scanner in dataset writing docs
  * [Packaging][Python] Python 3.9 installation fails in macOS wheel build
  * [CI][Python] Fix Spark integration failures
  * [Python] Fix version constraints in pyproject.toml
  * [Packaging][Python] Disable windows wheel testing for python 3.6
  * [Python][C++] Segfault with read_json when a field is missing
  * [Python] Support for set/list columns when converting from Pandas
  * [Python] Support converting nested sets when converting to arrow
  * [Python] Make filesystems compatible with fsspec
  * [C++][Python][R] Consolidate coalesce/fill_null
  * [Python][Doc] Document the fsspec wrapper for pyarrow.fs filesystems
  * [Python] Support core-site.xml default filesystem.
  * [Python] Improve HadoopFileSystem docstring
  * [Python][Doc] Document missing pandas to arrow conversions
  * [Python] Make SubTreeFileSystem print method more informative
  * [Doc][Python] Improve documentation regarding dealing with memory mapped files
  * [C++][Python] Implement a new scalar function: list_element
  * [Python] Allow creating RecordBatch from Python dict
  * [Python] Update HadoopFileSystem docs to clarify setting CLASSPATH env variable is required
  * [Python] Improve documentation on what 'use_threads' does in 'read_feather'
  * [C++][Python] Improve consistency of explicit C++ types in PyArrow files
  * [Doc][Python] Improve PyArrow documentation for new users
  * [C++][Python] Add CSV convert option to change decimal point
  * [Python][Packaging] Build M1 wheels for python 3.8
  * [Release][Python] Verify python 3.8 macOS arm64 wheel
  * [Doc][Python] Switch ipc/io doc to use context managers
  * [Python] Mention alternative deprecation message for ParquetDataset.partitions
  * [C++][Python] Implement ExtensionScalar
  * [Packaging][Python] Skip test_cancellation test case on M1
  * [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error
  * [Packaging][Python] Define --with-lg-page for jemalloc in the arm manylinux builds
  * [Python] Fix docstrings
  * [Python] Expose copy_files in pyarrow.fs
  * [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook
  * [Python] Update deprecated pytest yield_fixture functions
  * [Python] Support for MapType with Fields
  * [Python][Docs] Improve filesystem documentation
  * [Python] Add dataset mark to test_parquet_dataset_deprecated_properties
  * [Python] Preview data when printing tables
  * [C++][Python] Column projection pushdown for ORC dataset reading + use liborc for column selection
  * [C++][Python] Add support for new MonthDayNano Interval Type
  * [Doc][Python] Add documentation for unify_schemas
  * [C++][Python] Implement C data interface support for extension types
  * [Python] Allow more than numpy.array as masks when creating arrays
  * [Python] Correct TimestampScalar.as_py() and DurationScalar.as_py() docstrings
  * [Python] Migrate Python ORC bindings to use new Result-based APIs
  * [Python] Support tuples in unify_schemas
  * [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes
  * [C++][Python] Support cast of naive timestamps to strings
  * [Python] Update kernel categories in compute doc to match C++
  * [C++][Python][R] Implement count distinct kernel
  * [Python] Allow unsigned integer index type in dictionary() type factory function
  * [Python] Missing Python tests for compute kernels
  * [Python][CI] Add support for python 3.10
  * [C++][Python] Improve error message when trying use SyncScanner when requiring async
  * [Python] Extend CompressedInputStream to work with paths, strings and files
  * [Packaging][Python] Enable NEON SIMD optimization for M1 wheels
  * [C++][Python] Use std::move() explicitly for g++ 4.8.5
  * [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows
- Build via PEP517

-------------------------------------------------------------------
Mon Aug 22 07:06:44 UTC 2022 - John Vandenberg <jayvdb@gmail.com>

- Update to v9.0.0

-------------------------------------------------------------------
Mon Jan 21 03:51:32 UTC 2019 - Todd R <toddrme2178@gmail.com>

- Initial version for v0.13.0