awesome-data-quality

Awesome tools and resources for dealing with rogue data

Readings and Resources:
- Machine-learning on dirty data in Python: a tutorial (Gaël Varoquaux)
- Dirty data science: Machine learning on noncurated data (Gaël Varoquaux)
- The Quartz guide to bad data

Tools:
- dirty_cat: machine learning on dirty categories
- Fuzzy matching/identity resolution:
  - dedupe.io
  - Python Record Linkage Toolkit
  - fuzzymatcher