(ml) - multi touch attribution (dnn based) pipeline

py-zhai/DNAMTA

 
 

DeepAttribution

DeepAttribution is an AWS SageMaker ML pipeline that lets marketing data scientists and ML engineers compute multi-touch attribution results with a state-of-the-art, DNN-based technique and minimal setup effort.

Prerequisites

  • A large impression-level dataset (> 1 GB), the impressions dataset, stored as a Parquet file with the following schema:

                    column     | type | description
                    -----------|------|----------------------------------------------------
                    uid        | int  | the user/client unique identifier
                    timestamp  | int  | Unix timestamp
                    campaign   | str  | the campaign name
                    conversion | bool | whether a conversion happened after the impression

  • An S3 bucket (the deep-attribution bucket) with the following folder hierarchy:
    .
    ├── raw                         # contains impressions dataset
    ├── feature_store               # empty before pipeline execution
    ├── feature_store_preprocessed  # empty before pipeline execution
    ├── model                       # empty before pipeline execution
    └── attention_report            # empty before pipeline execution
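
Both prerequisites can be checked before launching the pipeline. The following is a minimal sketch, not part of the repository: `validate_row` and `missing_prefixes` are hypothetical helper names, and the checks rely only on the column types and folder names listed above.

```python
from typing import Any, Dict, Iterable, List

# Column names and types taken from the schema above.
EXPECTED_SCHEMA = {"uid": int, "timestamp": int, "campaign": str, "conversion": bool}

# Folder names taken from the bucket hierarchy above.
EXPECTED_PREFIXES = (
    "raw",
    "feature_store",
    "feature_store_preprocessed",
    "model",
    "attention_report",
)


def validate_row(row: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one impression row (hypothetical helper)."""
    problems = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        # bool is a subclass of int in Python, so reject it explicitly for int columns.
        if expected_type is int and isinstance(value, bool):
            problems.append(f"{column}: expected int, got bool")
        elif not isinstance(value, expected_type):
            problems.append(
                f"{column}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    return problems


def missing_prefixes(keys: Iterable[str]) -> List[str]:
    """Return the expected top-level folders that no object key falls under."""
    top_levels = {key.split("/", 1)[0] for key in keys if "/" in key}
    return [prefix for prefix in EXPECTED_PREFIXES if prefix not in top_levels]


if __name__ == "__main__":
    row = {"uid": 1, "timestamp": 1700000000, "campaign": "spring_sale", "conversion": False}
    print(validate_row(row))
    print(missing_prefixes(["raw/impressions.parquet", "model/"]))
```

In practice the rows would come from a sample of the Parquet file and the keys from a bucket listing (for example via boto3's `list_objects_v2`); both are left out so the sketch stays dependency-free.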

Using DeepAttribution

  1. Define the journey maximum length. Please refer to this doc to define it.
  2. Update the config file (config.yaml) with the desired instance type and count, bucket name and the journey maximum length.
  3. In the deep-attribution instance, open the pipeline execution notebook (deep_attribution/pipeline_exec.ipynb).
  4. Run all the cells.
  5. Retrieve the attribution results from the deep-attribution bucket (attention_report/campaign_attention.parquet).
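
For step 1, one common heuristic is to pick the smallest length that covers most users' journeys. The sketch below is an assumption, not the method described in the referenced doc: `journey_max_length` and its `coverage` parameter are hypothetical, and it treats all impressions of one `uid` as a single journey.

```python
import math
from collections import Counter
from typing import Iterable


def journey_max_length(uids: Iterable[int], coverage: float = 0.95) -> int:
    """Smallest journey length that covers at least `coverage` of users.

    `uids` holds one entry per impression row (the uid column).
    Assumption: every impression of a user belongs to one journey.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    # Impressions per user, sorted ascending.
    lengths = sorted(Counter(uids).values())
    if not lengths:
        raise ValueError("no impressions given")
    # Nearest-rank percentile: position of the user at the coverage quantile.
    rank = math.ceil(coverage * len(lengths))
    return lengths[rank - 1]


if __name__ == "__main__":
    # Four users with 3, 2, 1 and 10 impressions respectively.
    sample_uids = [1, 1, 1, 2, 2, 3] + [4] * 10
    print(journey_max_length(sample_uids, coverage=0.75))
```

A lower `coverage` truncates the longest journeys in exchange for shorter input sequences; the resulting value would then be entered in config.yaml in step 2.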

Contact

You can reach me at [email protected].

License

This project is licensed under the MIT License.
