diff --git a/sigpro/sigpro_pipeline_demo.ipynb b/sigpro/sigpro_pipeline_demo.ipynb
new file mode 100644
index 0000000..e50e8dc
--- /dev/null
+++ b/sigpro/sigpro_pipeline_demo.ipynb
@@ -0,0 +1,485 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "559d158c",
+ "metadata": {},
+ "source": [
+ "# Processing Signals with Pipelines\n",
+ "\n",
+ "Now that we have identified and/or generated several primitives for signal feature generation, we would like to define a reusable *pipeline* that applies them. \n",
+ "\n",
+ "First, let's import the required libraries and functions.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "29d47100",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sigpro\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "from matplotlib import pyplot as plt\n",
+ "from sigpro.demo import _load_demo as get_demo"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "85e23607",
+ "metadata": {},
+ "source": [
+ "\n",
+ "## Defining Primitives\n",
+ "\n",
+ "Recall that we can obtain the list of available primitives with the `get_primitives` method:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "fd8e7afe",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['sigpro.SigPro',\n",
+ " 'sigpro.aggregations.amplitude.statistical.crest_factor',\n",
+ " 'sigpro.aggregations.amplitude.statistical.kurtosis',\n",
+ " 'sigpro.aggregations.amplitude.statistical.mean',\n",
+ " 'sigpro.aggregations.amplitude.statistical.rms',\n",
+ " 'sigpro.aggregations.amplitude.statistical.skew',\n",
+ " 'sigpro.aggregations.amplitude.statistical.std',\n",
+ " 'sigpro.aggregations.amplitude.statistical.var',\n",
+ " 'sigpro.aggregations.frequency.band.band_mean',\n",
+ " 'sigpro.transformations.amplitude.identity.identity',\n",
+ " 'sigpro.transformations.amplitude.spectrum.power_spectrum',\n",
+ " 'sigpro.transformations.frequency.band.frequency_band',\n",
+ " 'sigpro.transformations.frequency.fft.fft',\n",
+ " 'sigpro.transformations.frequency.fft.fft_real',\n",
+ " 'sigpro.transformations.frequency_time.stft.stft',\n",
+ " 'sigpro.transformations.frequency_time.stft.stft_real']"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from sigpro import get_primitives\n",
+ "\n",
+ "get_primitives()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e7a5d87d",
+ "metadata": {},
+ "source": [
+ "In addition, we can also define our own custom primitives.\n",
+ "\n",
+ "## Building a Pipeline\n",
+ "\n",
+ "Let’s go ahead and define a feature processing pipeline that sequentially applies the `identity` and `fft` transformations before applying the `std` aggregation. To pass these primitives into the signal processor, we must write each primitive as a dictionary with the following fields:\n",
+ "\n",
+ "- `name`: Name of the transformation / aggregation.\n",
+ "- `primitive`: Name of the primitive to apply.\n",
+ "- `init_params`: Dictionary containing the initialization parameters for the primitive.\n",
+ "\n",
+ "Since we do not specify any initialization parameters here, we omit `init_params` from these dictionaries."
+ ]
+ },
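+ {
+ "cell_type": "markdown",
+ "id": "c0ffee01",
+ "metadata": {},
+ "source": [
+ "For primitives that do accept parameters, we would add an `init_params` entry to the dictionary. As a sketch (the parameter names below are purely illustrative and not taken from the primitive's actual signature):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c0ffee02",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative only: the keys of 'init_params' must match the chosen\n",
+ "# primitive's own hyperparameters; 'low' and 'high' here are hypothetical.\n",
+ "parametrized_transform = {'name': 'band1',\n",
+ "                          'primitive': 'sigpro.transformations.frequency.band.frequency_band',\n",
+ "                          'init_params': {'low': 100, 'high': 200}}"
+ ]
+ },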
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "748893d2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "identity_transform = {'name': 'identity1',\n",
+ " 'primitive': 'sigpro.transformations.amplitude.identity.identity'}\n",
+ "\n",
+ "fft_transform = {'name': 'fft1',\n",
+ " 'primitive': 'sigpro.transformations.frequency.fft.fft'}\n",
+ "\n",
+ "std_agg = {'name': 'std1',\n",
+ " 'primitive': \"sigpro.aggregations.amplitude.statistical.std\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bb97a448",
+ "metadata": {},
+ "source": [
+ "\n",
+ "We now define a new pipeline containing the primitives we would like to apply. At minimum, we will need to pass in a list of transformations and a list of aggregations; the full list of available arguments is given below.\n",
+ "\n",
+ "- Inputs:\n",
+ " - `transformations (list)`: List of dictionaries containing the transformation primitives.\n",
+ " - `aggregations (list)`: List of dictionaries containing the aggregation primitives.\n",
+ " - `values_column_name (str)` (optional): The name of the column that contains the signal values. Defaults to `'values'`.\n",
+ " - `keep_columns (Union[bool, list])` (optional): Whether to keep non-feature columns in the output DataFrame. If a list of column names is passed, those columns are kept. Defaults to `False`.\n",
+ " - `input_is_dataframe (bool)` (optional): Whether the input is a pandas DataFrame. Defaults to `True`.\n",
+ "\n",
+ "Returning to the example:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "755c6442",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "transformations = [identity_transform, fft_transform]\n",
+ "\n",
+ "aggregations = [std_agg]\n",
+ "\n",
+ "mypipeline = sigpro.SigPro(transformations, aggregations, values_column_name='yvalues', keep_columns=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1a69f6b7",
+ "metadata": {},
+ "source": [
+ "\n",
+ "SigPro will proceed to build an `MLPipeline` that can be reused to build features.\n",
+ "\n",
+ "To check that `mypipeline` was defined correctly, we can check the input and output arguments with the `get_input_args` and `get_output_args` methods."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "efea71e3",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[{'name': 'readings', 'keyword': 'data', 'type': 'pandas.DataFrame'}, {'name': 'feature_columns', 'default': None, 'type': 'list'}]\n",
+ "[{'name': 'readings', 'type': 'pandas.DataFrame'}, {'name': 'feature_columns', 'type': 'list'}]\n"
+ ]
+ }
+ ],
+ "source": [
+ "input_args = mypipeline.get_input_args()\n",
+ "output_args = mypipeline.get_output_args()\n",
+ "\n",
+ "print(input_args)\n",
+ "print(output_args)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7de8fe01",
+ "metadata": {},
+ "source": [
+ "## Applying a Pipeline with `process_signal`\n",
+ "\n",
+ "Once our pipeline is correctly defined, we apply the `process_signal` method to a demo dataset. Recall that `process_signal` is defined as follows:\n",
+ "\n",
+ "\n",
+ "```python\n",
+ "def process_signal(self, data=None, window=None, time_index=None, groupby_index=None,\n",
+ " feature_columns=None, **kwargs):\n",
+ "\n",
+ "    ...\n",
+ "    return data, feature_columns\n",
+ "```\n",
+ "\n",
+ "`process_signal` accepts as input the following arguments:\n",
+ "\n",
+ "- `data (pd.DataFrame)`: DataFrame with a column containing signal values.\n",
+ "- `window (str)`: Duration of the aggregation window, e.g. '1h'.\n",
+ "- `time_index (str)`: Name of column in `data` that represents the time index.\n",
+ "- `groupby_index (str or list[str])`: Column name or list of column names to group by before applying the window.\n",
+ "- `feature_columns (list)`: List of columns from the input data that should be considered as features (and not dropped).\n",
+ "\n",
+ "`process_signal` outputs the following:\n",
+ "\n",
+ "- `data (pd.DataFrame)`: DataFrame containing the output feature values constructed from the signal.\n",
+ "- `feature_columns (list)`: List of generated feature names.\n",
+ "\n",
+ "We now apply our pipeline to a toy dataset, defined as follows:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "05a2d02e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " turbine_id signal_id xvalues \\\n",
+ "0 T001 Sensor1_signal1 2020-01-01 00:00:00 \n",
+ "1 T001 Sensor1_signal1 2020-01-01 01:00:00 \n",
+ "2 T001 Sensor1_signal1 2020-01-01 02:00:00 \n",
+ "3 T001 Sensor1_signal1 2020-01-01 03:00:00 \n",
+ "4 T001 Sensor1_signal1 2020-01-01 04:00:00 \n",
+ "\n",
+ " yvalues sampling_frequency \n",
+ "0 [0.43616983763682876, -0.17662312586241055, 0.... 1000 \n",
+ "1 [0.8023828754411122, -0.14122063493312714, -0.... 1000 \n",
+ "2 [-1.3143142430046044, -1.1055740033788437, -0.... 1000 \n",
+ "3 [-0.45981995520032104, -0.3255426061995603, -0... 1000 \n",
+ "4 [-0.6380405111460377, -0.11924167777027689, 0.... 1000 "
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "demo_dataset = get_demo()\n",
+ "demo_dataset.columns = ['turbine_id', 'signal_id', 'xvalues', 'yvalues', 'sampling_frequency']\n",
+ "demo_dataset.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "74f2f337",
+ "metadata": {},
+ "source": [
+ "Finally, we apply the `process_signal` method of our previously defined pipeline:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "b97344a2",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " turbine_id signal_id xvalues \\\n",
+ "0 T001 Sensor1_signal1 2020-01-01 00:00:00 \n",
+ "1 T001 Sensor1_signal1 2020-01-01 01:00:00 \n",
+ "2 T001 Sensor1_signal1 2020-01-01 02:00:00 \n",
+ "3 T001 Sensor1_signal1 2020-01-01 03:00:00 \n",
+ "4 T001 Sensor1_signal1 2020-01-01 04:00:00 \n",
+ "\n",
+ " yvalues sampling_frequency \\\n",
+ "0 [0.43616983763682876, -0.17662312586241055, 0.... 1000 \n",
+ "1 [0.8023828754411122, -0.14122063493312714, -0.... 1000 \n",
+ "2 [-1.3143142430046044, -1.1055740033788437, -0.... 1000 \n",
+ "3 [-0.45981995520032104, -0.3255426061995603, -0... 1000 \n",
+ "4 [-0.6380405111460377, -0.11924167777027689, 0.... 1000 \n",
+ "\n",
+ " identity1.fft1.std1.std_value \n",
+ "0 14.444991 \n",
+ "1 12.326223 \n",
+ "2 12.051415 \n",
+ "3 10.657243 \n",
+ "4 12.640728 "
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "processed_data, feature_columns = mypipeline.process_signal(demo_dataset, time_index='xvalues')\n",
+ "\n",
+ "processed_data.head()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f04273a4",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Success! We have applied our primitives to generate features from the input dataset.\n"
+ ]
+ },
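+ {
+ "cell_type": "markdown",
+ "id": "c0ffee03",
+ "metadata": {},
+ "source": [
+ "As a final sanity check, we can inspect the generated feature names returned alongside the data; based on the output above, we expect a single name concatenating the applied primitives:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c0ffee04",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# The feature name joins the primitive names in application order.\n",
+ "print(feature_columns)"
+ ]
+ },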
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d804c3d2",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.13"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}