This is the code repository for XGBoost for Regression Predictive Modeling and Time Series Analysis, published by Packt.
Learn how to build, evaluate, and deploy predictive models with expert guidance
XGBoost offers a powerful solution for regression and time series analysis, enabling you to build accurate and efficient predictive models. In this book, the authors draw on their combined experience of 40+ years in the semiconductor industry to help you harness the full potential of XGBoost, from understanding its core concepts to implementing real-world applications.
This book covers the following exciting features:
- Build a strong, intuitive understanding of the XGBoost algorithm and its benefits
- Implement XGBoost using the Python API for practical applications
- Evaluate model performance using appropriate metrics
- Deploy XGBoost models into production environments
- Handle complex datasets and extract valuable insights
- Gain practical experience in feature engineering, feature selection, and categorical encoding
If you feel this book is for you, get your copy today!
All of the code is organized into folders.
The code will look like the following:
import seaborn as sns
sns.displot( irisdata, x="Species",
discrete = True, hue="Species",
shrink =0.8, palette="Greys" )
Following is what you need for this book: This book is for data scientists, machine learning practitioners, analysts, and professionals interested in predictive modeling and time series analysis. Basic coding knowledge and familiarity with Python, GitHub, and other DevOps tools are required.
With the following software and hardware list you can run all code files present in the book (Chapter 1-13).
Chapter | Software required | OS required |
---|---|---|
1-13 | Python 3.8+ Jupyter Notebook | Windows, Mac OS X, and Linux (Any) |
1-13 | Jupyter Notebook | Windows, Mac OS X, and Linux (Any) |
Partha Pritam Deka is a data science leader with 15+ years of experience in semiconductor supply chain and manufacturing. As a senior staff engineer at Intel, he has led AI and machine learning teams, achieving significant cost savings and optimizations. He and his team developed a computer vision system that improved Intel’s logistics, earning CSCMP Innovation Award finalist recognition. An active AI community member, Partha is a senior IEEE member and speaker at Intel’s AI Everywhere conference. He also reviews for NeurIPS, contributing to AI and analytics in semiconductor manufacturing.
Joyce Weiner is a principal engineer with Intel Corporation. She has over 25 years of experience in the semiconductor industry, having worked in fabrication, assembly and testing, and design. Since the early 2000s, she has deployed applications that use machine learning. Joyce is a black belt in Lean Six Sigma and her area of technical expertise is the application of data science to improve efficiency. She has a BS in Physics from Rensselaer Polytechnic Institute and an MS in Optical Sciences from the University of Arizona.
Welcome to the XGBoost for Regression Predictive Modeling and Time Series Analysis repository. This repository contains practical code examples, data, and resources to accompany the book on utilizing XGBoost for regression and time series forecasting tasks. The book explores foundational to advanced techniques, providing readers with tools to understand, develop, and deploy XGBoost models effectively.
This repository includes code and supplementary resources for the chapters in the book, covering topics such as:
- Fundamentals of XGBoost and its application in predictive modeling
- Techniques for time series forecasting with XGBoost
- Feature engineering, interpretability, and model deployment
- Practical examples using real-world datasets for hands-on learning
This repository serves as a companion to the book, helping practitioners and researchers deepen their understanding of XGBoost.
To get started, clone this repository and install the dependencies listed in the xgboostenv.yml
file.
git clone https://github.com/your-username/XGBoost-for-Regression-Predictive-Modeling-and-Time-Series-Analysis.git
cd XGBoost-for-Regression-Predictive-Modeling-and-Time-Series-Analysis
Use the provided Conda environment file to set up the required dependencies.
conda env create -f xgboostenv.yml
conda activate xgboostenv
This environment will install all necessary libraries, including XGBoost, scikit-learn, and other dependencies required to run the code in this repository.
Please note that Chapter 10 has additional or specific dependencies that require a separate environment setup. Follow the instructions in the ch10
folder's README or setup script to create and activate this specialized environment.
Each chapter directory (ch2
, ch4
, etc.) corresponds to a chapter in the book and contains the relevant code and datasets.
ch2
,ch4
,ch5
, ... : Chapter directories with practical examples and exercisesch10
: Contains additional setup instructions specific to Chapter 10, please refer to Chapter10_Environment_Setup_Instructions.txt file for setting up the python enviromentxgboostenv.yml
: Conda environment file for installing dependencies for most chaptersLICENSE
: License information for the repository
Each chapter's code can be executed individually. Here’s how to get started:
-
Navigate to the desired chapter folder:
cd ch2
-
Run the scripts or Jupyter notebooks within the folder to follow along with the book’s examples.
For a quick example of running the code in Chapter 2:
-
Open the Jupyter Notebook in
ch2
:jupyter notebook ch2/notebook_name.ipynb
-
Execute the cells to see the results and experiment with the parameters.
- Partha Pritam Deka
- Joyce Weiner
Contributions to enhance and expand this repository are welcome. Please submit a pull request with a description of your changes.
This repository is licensed under the terms specified in the LICENSE
file. Please refer to it for more information.
Happy modeling! For further details and explanations, refer to the book accompanying this repository.