Intel® oneAPI Data Analytics Library 2021.3
The release introduces the following changes:
📚 Support Materials
The following additional materials were created:
- Medium blogs:
  - Superior Machine Learning Performance on the Latest Intel Xeon Scalable Processors
  - Leverage Intel Optimizations in Scikit-Learn (SVM Performance Training and Inference)
  - Optimizing CatBoost Performance
  - Performance Optimizations for End-to-End AI Pipelines
  - Optimizing the End-to-End Training Pipeline on Apache Spark Clusters
- Kaggle kernels:
  - [Tabular Playground Series - Apr 2021] RF with Intel Extension for Scikit-learn
  - [Tabular Playground Series - Apr 2021] SVM with Intel Extension for Scikit-learn
  - [Tabular Playground Series - Apr 2021] SVM with scikit-learn-intelex
- Samples that illustrate the usage of Intel Extension for Scikit-learn
🛠️ Library Engineering
- Introduced a new Python package, Intel® Extension for Scikit-learn*. The scikit-learn-intelex package contains the scikit-learn patching functionality that was originally available in the daal4py package. All future updates for the patches will be available only in Intel® Extension for Scikit-learn. We recommend using the scikit-learn-intelex package instead of daal4py.
  - Download the extension using one of the following commands:
    pip install scikit-learn-intelex
    conda install scikit-learn-intelex -c conda-forge
  - Enable Scikit-learn patching:
    from sklearnex import patch_sklearn
    patch_sklearn()
- Made the DPC++ runtime an optional dependency of daal4py. To enable the DPC++ backend, install the dpcpp_cpp_rt package. This change reduces the default package size with all dependencies from 1.2 GB to 400 MB.
- Added support for building oneDAL-based applications with the /MD and /MDd options on Windows. The -d suffix is used in the names of oneDAL libraries that are built with the debug run-time (/MDd).
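As a minimal sketch of the patching step above, the helper below enables patching only when scikit-learn-intelex is installed and otherwise leaves stock scikit-learn untouched. The name enable_sklearn_acceleration is hypothetical and not part of the package; only the sklearnex import and the patch_sklearn() call come from these notes.

```python
def enable_sklearn_acceleration():
    """Patch scikit-learn if the extension is installed.

    Hypothetical helper: returns True when patching was applied and
    False when scikit-learn-intelex is not available.
    """
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        # Extension not installed: keep using stock scikit-learn.
        return False
    patch_sklearn()
    return True


accelerated = enable_sklearn_acceleration()
print("Intel Extension for Scikit-learn enabled:", accelerated)

# Import scikit-learn estimators only after this point, for example:
#     from sklearn.svm import SVC
```

The call is placed before any scikit-learn estimator import so that the estimators imported afterwards are the patched ones.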
🚨 What's New
Introduced new oneDAL and daal4py functionality:
- CPU:
  - SVM Regression algorithm
  - NuSVM algorithm for both Classification and Regression tasks
  - Polynomial kernel support for all SVM algorithms (SVC, SVR, NuSVC, NuSVR)
  - Minkowski and Chebyshev distances for kNN Brute-force
  - The brute-force method and the voting mode support for kNN algorithm in oneDAL interfaces
  - Multiclass support for SVM algorithms in oneDAL interfaces
  - CSR-matrix support for SVM algorithms in oneDAL interfaces
  - Subgraph Isomorphism algorithm technical preview
  - Single Source Shortest Path (SSSP) algorithm technical preview
Improved oneDAL and daal4py performance for the following algorithms:
- CPU:
  - Support Vector Machines training and prediction
  - Linear, Ridge, ElasticNet, and LASSO regressions prediction
- GPU:
  - Decision Forest training and prediction
  - Principal Components Analysis training
Introduced support for scikit-learn 1.0 in Intel Extension for Scikit-learn.
- The 2021.3 release of Intel Extension for Scikit-learn supports the latest scikit-learn releases: 0.22.X, 0.23.X, 0.24.X and 1.0.X.
Introduced new functionality for Intel Extension for Scikit-learn:
- General:
  - The support of patch_sklearn for all algorithms
- CPU:
  - Acceleration of SVR estimator
  - Acceleration of NuSVC and NuSVR estimators
  - Polynomial kernel support in SVM algorithms
Improved the performance of the following scikit-learn estimators via scikit-learn patching:
- SVM algorithms training and prediction
- Linear, Ridge, ElasticNet, and Lasso regressions prediction
Fixed the following issues:
- General:
  - Fixed binary incompatibility for the versions of numpy earlier than 1.19.4
  - Fixed an issue with a very large number of trees (> 7000) for Random Forest algorithm
  - Fixed patch_sklearn to patch both fit and predict methods of Logistic Regression when the algorithm is given as a single parameter to patch_sklearn
- CPU:
  - Improved numerical stability of training for Alternating Least Squares (ALS) and Linear and Ridge regressions with Normal Equations method
  - Reduced the memory consumption of SVM prediction
- GPU:
  - Fixed an issue with kernel compilation on the platforms without hardware FP64 support
❗ Known Issues
- Intel® Extension for Scikit-learn and daal4py packages installed from the PyPI repository can’t be found on Debian systems (including Google Colab). Mitigation: add the “site-packages” folder to the Python package search path before importing the packages:
import sys
import os
import site
sys.path.append(os.path.join(os.path.dirname(site.getsitepackages()[0]), "site-packages"))
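The mitigation above can also be wrapped in a small helper so that repeated calls (for example, re-running a notebook cell) do not add the same entry twice. This is only a sketch: the function name add_site_packages_to_path is hypothetical, and the path computation is the same one used in the snippet above.

```python
import os
import site
import sys


def add_site_packages_to_path():
    """Append the sibling "site-packages" folder to sys.path if it is missing.

    Hypothetical helper: returns the folder path that was checked.
    """
    # Same computation as in the mitigation: take the parent of the first
    # site directory and point at its "site-packages" folder.
    base = os.path.dirname(site.getsitepackages()[0])
    candidate = os.path.join(base, "site-packages")
    if candidate not in sys.path:
        sys.path.append(candidate)
    return candidate


site_packages_dir = add_site_packages_to_path()

# Import the packages only after the search path has been extended, e.g.:
#     from sklearnex import patch_sklearn
```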