Revisions of onednn

buildservice-autocommit accepted request 1169605 from Guillaume GARDET (Guillaume_G) (revision 29)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 1169303 from Alessandro de Oliveira Faria (cabelo) (revision 28)
- Update to 3.4.1:
  * Fixed an issue with caching and serialization of primitives in 
    deterministic mode (7ed604a)
  * Introduced memory descriptor serialization API
    (4cad420, 929a27a, 9b848c8); a usage sketch follows this entry
  * Fixed incorrect results in fp64 convolution and deconvolution 
    on Intel GPUs based on Xe-LPG architecture (ebe77b5, 0b399ac, 
    d748d64, 9f4f3d5, 21a8cae)
  * Fixed incorrect results in reorder with large sizes on 
    Intel CPUs and GPUs (69a111e, 4b72361, 74a343b)
  * Reduced creation time for deconvolution primitive on 
    Intel CPUs (bec487e, 1eab005)
  * Fixed performance regression in deconvolution on 
    Intel CPUs (fbe5b97, 1dd3c6a)
  * Removed dangling symbols from static builds
    (e92c404, 6f5621a)
  * Fixed crash during platform detection on some 
    AArch64-based systems (406a079)
  * Fixed performance regression in int8 deconvolution on 
    Intel CPUs (7e50e15)
  * Fixed handling of zero points for matmul in verbose 
    logs converter (15c7916) 
- Update to 3.3.3:
- This is a patch release containing the following changes to v3.3.2:
  * Fixed performance regression in int8 convolutions on processors with Intel AVX-512 and Intel DL Boost support (a00661f)
  * Fixed race condition during library initialization on Intel Data Center GPU Max Series (7dfcd11)
  * Fixed accuracy issue in experimental Graph Compiler with LLVM code generator (8892e7e)
  * Disabled int8 RNN implementation for cases with non-trivial strides (2195e4b)
  * Fixed incorrect results in bfloat16 convolution implementation on processors with Intel AMX support (9f00af9)
  * Fixed incorrect results in fp16 and int8 convolution on Intel Core Ultra integrated GPUs (69cef84, 79bc6cc, c9c0b09)
- Update to 3.3.1:
- This is a patch release containing the following changes to v3.3:
  * Fixed int8 convolution accuracy issue on Intel GPUs (09c87c7)
  * Switched internal stream to in-order mode for NVIDIA and AMD GPUs to avoid synchronization issues (db01d62)
  * Fixed runtime error for avgpool_bwd operation in Graph API (d025ef6, 9e0602a, e0dc1b3)
  * Fixed benchdnn error reporting for some Graph API cases (98dc9db)
  * Fixed accuracy issue in experimental Graph Compiler for int8 MHA variant from StarCoder model (5476ef7)
  * Fixed incorrect results for layer normalization with trivial dimensions on Intel GPUs (a2ec0a0)
  * Removed redundant synchronization for out-of-order SYCL queues (a96e9b1)
  * Fixed runtime error in experimental Graph Compiler for int8 MLP subgraph from LLAMA model (595543d)
  * Fixed SEGFAULT in experimental Graph Compiler for fp32 MLP subgraph (4207105)
  * Fixed incorrect results in experimental Graph Compiler for MLP subgraph (57e14b5)
  * Fixed the issue with f16 inner product primitive with s8 output returning unimplemented on Intel GPUs (bf12207, 800b5e9, ec7054a)
  * Fixed incorrect results for int8 deconvolution with zero-points on processors with Intel AMX instructions support (55d2cec)
- Update to 3.3:
  * 3.3: https://github.com/oneapi-src/oneDNN/releases/tag/v3.3
  * 3.2: https://github.com/oneapi-src/oneDNN/releases/tag/v3.2
  * 3.1: https://github.com/oneapi-src/oneDNN/releases/tag/v3.1
- Drop upstreamed onednn-fix-gcc13.patch
- Update to 3.0.1:
  * Changes: https://github.com/oneapi-src/oneDNN/releases/tag/v3.0.1
- Skipped 3.0:
  * Changes: https://github.com/oneapi-src/oneDNN/releases/tag/v3.0
- Add patch to fix build with GCC13:
  * onednn-fix-gcc13.patch
- Disable Arm Compute library support until fixed upstream
  https://github.com/oneapi-src/oneDNN/issues/1599
- Drop upstream patches:
  * 1428.patch
  * fa93750.patch
- Add patch to fix build with latest Arm Compute Library:
  * 1428.patch
  * fa93750.patch (dep for 1428.patch)
- Update to 2.6.2:
  * https://github.com/oneapi-src/oneDNN/releases
- Removed onednn-1045.patch.
- Removed onednn-xbyak-aarch64.patch.
- Fix build on aarch64:
  * onednn-xbyak-aarch64.patch
- Update to version 2.2.4:
  * Fixed build error with GCC 11 (eda1add)
  * Fixed an issue with reorder reporting unimplemented when
    quantizing f32 weights to s8 (4f05b76, 5d3d1e1, cc77eef); a
    quantizing-reorder sketch follows this entry
  * Updated name for GPU gen12 architecture to xe (3d202c2)
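
For context on the reorder fix above: quantizing f32 weights to s8 is expressed as a reorder with a scaling attribute. Below is a minimal sketch under the 2.x-era API (set_output_scales() was later superseded by per-argument scales in 3.x); it is an illustration, not code shipped with this package.

    // Quantize f32 weights to s8 with a reorder primitive
    // (2.x-era attribute API; set_output_scales() was later replaced).
    #include <vector>
    #include "dnnl.hpp"

    int main() {
        using namespace dnnl;
        engine eng(engine::kind::cpu, 0);
        stream strm(eng);

        // 64x64 weights in a plain row-major layout.
        memory::desc f32_md({64, 64}, memory::data_type::f32,
                memory::format_tag::ab);
        memory::desc s8_md({64, 64}, memory::data_type::s8,
                memory::format_tag::ab);

        std::vector<float> weights(64 * 64, 0.5f);
        memory f32_mem(f32_md, eng, weights.data());
        memory s8_mem(s8_md, eng);

        // One common scale for the whole tensor (mask = 0).
        primitive_attr attr;
        attr.set_output_scales(0, {127.f});

        // The reorder performs the f32 -> s8 conversion with scaling.
        auto rpd = reorder::primitive_desc(eng, f32_md, eng, s8_md, attr);
        reorder(rpd).execute(strm, f32_mem, s8_mem);
        strm.wait();
        return 0;
    }
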
- Drop upstream patch:
  * 0001-common-gpu-include-thread-and-limit-headers-to-fix-G.patch
- Update to version 2.2.3:
  * Fixed a bug in int8 depthwise convolution primitive with groups
    and 1d spatial size for processors with AVX-512 and AVX2 support
  * Fixed correctness issue for PReLU primitive
  * Fixed correctness issue in reorder for blocked layouts with
    zero padding
  * Improved performance of weights reorders used by BRGEMM-based
    convolution primitive for processors with AVX-512 support
  * Added -fp-model=precise build flag for DPC++ code
  * Fixed potential memory leak in matmul primitive
  * Fixed performance of matmul primitive when fused with bias
    update and sum
  * Fixed a bug in matmul primitive when writing to non-contiguous
    destination buffer
- Add upstream patch for GCC11 support
  * 0001-common-gpu-include-thread-and-limit-headers-to-fix-G.patch
- Update descriptions.
- Update to 2.2.2, changes:
  * Fixed performance regression in fp32 forward inner product for
  shapes with number of output channels equal to 1 for processors
  with Intel AVX-512 support (714b1fd)
  * Fixed performance regression in forward convolutions with groups
  for processors with Intel AVX-512 support (3555d4a)
  * Removed -std=c++11 build flag for DPC++ headers (1fcb867)
  * Fixed buffer access in initializing workspace in RNN
  implementation on GPU (9b03091)
  * Fixed a bug in convolution with 1x1 kernel and mixed
  strides on processors with Intel AVX-512 support (d0b3e3f)
  * Used getauxval on Linux to get CPU features on AArch64
  systems (25c4cea); see the detection sketch after this entry
  * Added -fp-model=precise build flag for DPC++ code (3e40e5e)
  * Fixed out-of-bounds writes in elementwise primitive on
  Intel Processor Graphics (bcf823c)
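
For context on the getauxval item above: getauxval(3) reads the ELF auxiliary vector, which on AArch64 Linux carries the HWCAP feature bits, so no /proc/cpuinfo parsing is needed. A small illustrative probe follows; the bits checked are examples, not necessarily the ones oneDNN inspects internally.

    // AArch64 CPU feature detection via the ELF auxiliary vector,
    // the mechanism the 2.2.2 change above switched to on Linux.
    #include <cstdio>
    #include <sys/auxv.h>
    #if defined(__aarch64__)
    #include <asm/hwcap.h> // HWCAP_* bit definitions
    #endif

    int main() {
    #if defined(__aarch64__)
        unsigned long hwcaps = getauxval(AT_HWCAP);
        // Which bits get checked here is purely illustrative.
        std::printf("ASIMD (NEON): %s\n",
                (hwcaps & HWCAP_ASIMD) ? "yes" : "no");
        std::printf("SVE: %s\n", (hwcaps & HWCAP_SVE) ? "yes" : "no");
    #else
        std::printf("not an AArch64 build\n");
    #endif
        return 0;
    }
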
- Fix build with Arm Compute Library:
  * onednn-1045.patch
- Update to 2.2.1, changes:
  * From 2.2:
  Fixed segfault for cases when primitive descriptor or attributes contain NaN (e6d05ec, dbca1e9, 0326b09)
  Fixed engine creation failure for GPU subdevices (4c3a114)
  Fixed long lines clipping in verbose output (70d70a8)
  Fixed segfault in bfloat16 convolution weight gradient implementation on processors with Intel AMX support (a3a73a3)
  Fixed performance regression in binary primitive with per_oc broadcast strategy (9ac85d8)
  Worked around a bug with Microsoft Visual C++ compiler version detection in CMake 3.19 (2f39155)
  Removed -std=c++11 build flag for DPC++ code to align with SYCL standard (1b026f5)
  * Changes between 2.1 and 2.2:
  Performance Optimizations
    Intel Architecture processors
      Improved performance of int8 compute functionality for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via CPU dispatcher control (see the dispatcher sketch after this entry).
      Improved performance of compute functionality for future Intel Core processor with Intel AVX2 and Intel DL Boost instructions support (code name Alder Lake).
      Improved fp32 inner product forward propagation performance for processors with Intel AVX-512 support.
      Improved dnnl_gemm performance for cases with n=1 on all supported processors.
    Intel Graphics products
      Introduced NHWC format support for activations for int8 primitives.
    AArch64-based processors
      Improved performance of fp32 and int8 convolution, and softmax primitives for processors with SVE 512 support.
      Improved performance of fp32 convolution via Arm Compute Library (ACL).
      Improved performance of convolution with a combination of sum and relu post-ops via ACL.
  Functionality
    Extended eltwise primitive with support for mish and hardswish algorithms.
    Extended binary primitive with support for comparison operators.
    Introduced support for post-ops in GPU resampling implementation.
    Introduced asymmetric quantization support for int8 deconvolution.
    Introduced binary post-ops support for matmul primitive.
  Usability
    Improved presentation of oneDNN primitives in VTune Amplifier.
    Introduced Linux perf support for AArch64.
    Introduced support for Fujitsu C++ compiler.
    Introduced a build time check for minimal supported ACL version. Currently oneDNN requires ACL 21.02 or later.
    Added support for cuDNN 8.x
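
On the "CPU dispatcher control" wording in the Sapphire Rapids int8 item above: this refers to oneDNN's runtime ISA limit, controlled through the DNNL_MAX_CPU_ISA environment variable or the set_max_cpu_isa() call. A minimal sketch, assuming the avx512_core_amx enum value of that release series:

    // Raise the CPU ISA limit so the dispatcher may pick the int8/AMX
    // kernels that this release series disables by default.
    // Equivalent environment control: DNNL_MAX_CPU_ISA=AVX512_CORE_AMX
    #include <cstdio>
    #include "dnnl.hpp"

    int main() {
        // Must be called before the first primitive is created.
        dnnl::status st
                = dnnl::set_max_cpu_isa(dnnl::cpu_isa::avx512_core_amx);
        std::printf("set_max_cpu_isa: %s\n",
                st == dnnl::status::success ? "ok" : "not applied");
        return 0;
    }

For packaging and test runs the environment variable is usually the more convenient switch, since it needs no code changes.
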
- Update to 2.1
- Add Arm ComputeLibrary support on aarch64
- Obsoletes mkl-dnn* <= %{version}
- Rename mkl-dnn to onednn to follow upstream
- Update to 1.6.3
- Drop upstream patch:
  * cmake-no-install-ocl-cmake.patch
- Build on aarch64 and ppc64le which are now also supported
- Provide oneDNN and oneDNN-devel as it is the new official name
- Update to 1.4:
  * Performance improvements all over the board
- Rebase patch cmake-no-install-ocl-cmake.patch
- Add constraints to not crash during testing on OOM
- Do not disable LTO; there is no actual reason for that
- Export LD_LIBRARY_PATH to fix the build of older releases
- There is no actual reason not to use the GitHub tag for tarball
  fetching -> remove the service
- Format with spec-cleaner
- Use proper %cmake macros everywhere
- Add configure options for cmake to set it up in a way we really
  want
- Add patch from Debian to not install OpenCL cmake finder:
  * cmake-no-install-ocl-cmake.patch
- Enabled tests
- Packaged separate benchdnn package with its input files
- Updated to v1.1.3, which includes:
 * Fixed the mean and variance memory descriptors in layer 
   normalization (65f1908)
 * Fixed the layer normalization formula (c176ceb)
- Updated to v1.1.2:
  * Fixed threading over the spatial in bfloat16 batched
    normalization (017b6c9)
  * Fixed read past end-of-buffer error for int8 convolution (7d6f45e)
  * Fixed condition for dispatching optimized channel blocking in 
    fp32 backward convolution on Intel Xeon Phi(TM) processor (846eba1)
  * Fixed fp32 backward convolution for shapes with spatial strides 
    over the depth dimension (002e3ab)
  * Fixed softmax with zero sizes on GPU (936bff4)
  * Fixed int8 deconvolution with dilation when ih <= dh (3e3bacb)
  * Enabled back fp32 -> u8 reorder for RNN (a2c2507)
  * Fixed segmentation fault in bfloat16 backward convolution from 
    kd_padding=0 computation (52d476c)
  * Fixed segmentation fault in bfloat16 forward convolution due 
    to push/pop imbalance (4f6e3d5)
  * Fixed library version for OS X build (0d85005)
  * Fixed padding by channels in concat (a265c7d)
  * Added full text of third party licenses and 
    copyright notices to LICENSE file (79f204c)
  * Added separate README for binary packages (28f4c96)
  * Fixed computing per-oc mask in RNN (ff3ffab)
  * Added workaround for number of cores calculation in Xbyak (301b088)
- Added ARCH_OPT_FLAGS=""
- Initial check-in of the Intel(R) Math Kernel Library for
  Deep Neural Networks, which can be used by:
  * TensorFlow
  * Caffe
  * PyTorch
  and other machine learning tools
buildservice-autocommit accepted request 1137488 from Guillaume GARDET (Guillaume_G) (revision 27)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 1135202 from Alessandro de Oliveira Faria (cabelo) (revision 26)
buildservice-autocommit accepted request 1130165 from Christian Goll (mslacken) (revision 25)
baserev update by copy to link target
Christian Goll (mslacken) accepted request 1130129 from Alessandro de Oliveira Faria (cabelo) (revision 24)
buildservice-autocommit accepted request 1116623 from Guillaume GARDET (Guillaume_G) (revision 23)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 1116604 from Paolo Stivanin (polslinux) (revision 22)
buildservice-autocommit accepted request 1073553 from Guillaume GARDET (Guillaume_G) (revision 21)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 1073552 from Guillaume GARDET (Guillaume_G) (revision 20)
buildservice-autocommit accepted request 1005197 from Guillaume GARDET (Guillaume_G) (revision 19)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 1005196 from Guillaume GARDET (Guillaume_G) (revision 18)
buildservice-autocommit accepted request 1003336 from Christian Goll (mslacken) (revision 17)
baserev update by copy to link target
Christian Goll (mslacken) accepted request 1003326 from Paolo Stivanin (polslinux) (revision 16)
buildservice-autocommit accepted request 900181 from Guillaume GARDET (Guillaume_G) (revision 15)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 900180 from Guillaume GARDET (Guillaume_G) (revision 14)
buildservice-autocommit accepted request 900151 from Guillaume GARDET (Guillaume_G) (revision 13)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 900115 from Guillaume GARDET (Guillaume_G) (revision 12)
buildservice-autocommit accepted request 897415 from Guillaume GARDET (Guillaume_G) (revision 11)
baserev update by copy to link target
Guillaume GARDET (Guillaume_G) accepted request 897397 from Ferdinand Thiessen (susnux) (revision 10)