Overview
Request 1173003 accepted
- Fix sample source path in build script.
- Update to 2024.1.0
- More Generative AI coverage and framework integrations to
minimize code changes.
* Mixtral and URLNet models optimized for performance
improvements on Intel® Xeon® processors.
* Stable Diffusion 1.5, ChatGLM3-6B, and Qwen-7B models
optimized for improved inference speed on Intel® Core™
Ultra processors with integrated GPU.
  * Support for Falcon-7B-Instruct, a ready-to-use GenAI Large
    Language Model (LLM) for chat and instruction tasks with strong
    performance metrics (see the usage sketch after this list).
  * New Jupyter Notebooks added: YOLO V9, YOLO V8
    Oriented Bounding Boxes (OBB) Detection, Stable Diffusion
    in Keras, MobileCLIP, RMBG-v1.4 Background Removal, Magika,
    TripoSR, AnimateAnyone, LLaVA-NeXT, and a RAG system with
    OpenVINO and LangChain.
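
  A minimal sketch of trying the newly supported Falcon-7B-Instruct
  through huggingface/optimum-intel (the integration path the notices
  below also recommend); the prompt and generation settings are
  illustrative only:

    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    model_id = "tiiuae/falcon-7b-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the checkpoint to OpenVINO IR on the fly.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)

    inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
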
- Broader Large Language Model (LLM) support and more model
compression techniques.
  * LLM compilation time reduced through additional optimizations
    with compressed embeddings. Improved first-token performance of
    LLMs on 4th- and 5th-generation Intel® Xeon® processors
    with Intel® Advanced Matrix Extensions (Intel® AMX).
  * Better LLM compression and improved performance with oneDNN,
    INT4, and INT8 support for Intel® Arc™ GPUs (see the
    weight-compression sketch after this list).
* Significant memory reduction for select smaller GenAI
models on Intel® Core™ Ultra processors with integrated GPU.
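
  A minimal INT4 weight-compression sketch with NNCF, assuming an LLM
  already converted to OpenVINO IR ("llm.xml" is a hypothetical path;
  ratio and group_size values are illustrative and need per-model
  tuning):

    import nncf
    import openvino as ov

    core = ov.Core()
    model = core.read_model("llm.xml")  # hypothetical IR of an LLM

    # INT4 symmetric weight compression via NNCF.
    compressed = nncf.compress_weights(
        model,
        mode=nncf.CompressWeightsMode.INT4_SYM,
        ratio=0.8,
        group_size=128,
    )
    ov.save_model(compressed, "llm_int4.xml")
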
- More portability and performance to run AI at the edge,
in the cloud, or locally.
  * The preview NPU plugin for Intel® Core™ Ultra processors
    is now available in the OpenVINO open-source GitHub
    repository, in addition to the main OpenVINO package on PyPI
    (see the device-discovery sketch after this list).
  * The JavaScript API is now available through the npm
    repository, giving JavaScript developers seamless access
    to the OpenVINO API.
  * FP16 inference is now enabled by default for convolutional
    neural networks (CNNs) on ARM processors.
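
  A minimal device-discovery sketch; "NPU" appears in the device list
  only when the preview plugin and a matching driver are installed
  ("model.xml" is a hypothetical IR file):

    import openvino as ov

    core = ov.Core()
    # Lists devices the runtime can see, e.g. ['CPU', 'GPU', 'NPU'].
    print(core.available_devices)

    model = core.read_model("model.xml")
    compiled = core.compile_model(model, "NPU")
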
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised. They
    remain available to enable a smooth transition to new solutions
    and will be discontinued in the future. To keep using
    discontinued features, you will have to revert to the last
    LTS OpenVINO version that supports them.
* For more details, refer to the OpenVINO Legacy Features
and Components page.
* Discontinued in 2024.0:
+ Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU)
        for low-power systems such as Intel® Core™ Ultra or
        14th-generation processors and beyond.
- OpenVINO C++/C/Python 1.0 APIs (see 2023.3 API
transition guide for reference).
      - All ONNX Frontend legacy API (known as
        ONNX_IMPORTER_API).
      - 'PerformanceMode.UNDEFINED' property as part of
        the OpenVINO Python API.
+ Tools:
- Deployment Manager. See installation and deployment
guides for current distribution options.
- Accuracy Checker.
      - Post-Training Optimization Tool (POT). The Neural Network
        Compression Framework (NNCF) should be used instead (see
        the quantization sketch after this list).
- A Git patch for NNCF integration with
huggingface/transformers. The recommended approach
is to use huggingface/optimum-intel for applying
NNCF optimization on top of models from Hugging
Face.
- Support for Apache MXNet, Caffe, and Kaldi model
formats. Conversion to ONNX may be used as
a solution.
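
    A minimal NNCF post-training quantization sketch as the POT
    replacement; the model path and the random calibration inputs are
    placeholders (use a few hundred representative, preprocessed
    samples in practice):

      import numpy as np
      import nncf
      import openvino as ov

      core = ov.Core()
      model = core.read_model("model.xml")  # hypothetical IR file

      # Placeholder calibration data; shape assumes an image model.
      items = [np.random.rand(1, 3, 224, 224).astype(np.float32)
               for _ in range(10)]
      calibration_dataset = nncf.Dataset(items)

      quantized = nncf.quantize(model, calibration_dataset)
      ov.save_model(quantized, "model_int8.xml")
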
* Deprecated and to be removed in the future:
+ The OpenVINO™ Development Tools package (pip install
openvino-dev) will be removed from installation options
and distribution channels beginning with OpenVINO 2025.0.
    + Model Optimizer will be discontinued with OpenVINO 2025.0.
      Consider using the new conversion methods instead (see the
      migration sketch after these notices). For more details, see
      the model conversion transition guide.
+ OpenVINO property Affinity API will be discontinued with
OpenVINO 2025.0. It will be replaced with CPU binding
configurations (ov::hint::enable_cpu_pinning).
+ OpenVINO Model Server components:
- “auto shape” and “auto batch size” (reshaping a model
in runtime) will be removed in the future. OpenVINO’s
dynamic shape models are recommended instead.
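
  A minimal migration sketch for two of the deprecations above:
  ov.convert_model in place of the Model Optimizer CLI, and CPU
  pinning set through configuration instead of the Affinity API
  ("model.onnx" is a hypothetical input file; the string key mirrors
  ov::hint::enable_cpu_pinning and is an assumption about the
  accepted spelling):

    import openvino as ov

    # Conversion without the deprecated Model Optimizer CLI; accepts
    # file paths or in-memory framework models.
    model = ov.convert_model("model.onnx")
    ov.save_model(model, "model.xml")

    # CPU pinning via config, replacing the deprecated Affinity API.
    core = ov.Core()
    compiled = core.compile_model(model, "CPU",
                                  {"ENABLE_CPU_PINNING": True})
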
Request History
cabelo created request
Guillaume_G accepted request
@Guillaume_G, @TheBlackCat, @eeich, @mslacken, @vitezslav_cizek: review reminder