davhbrown/interactive_classification_metrics

Interactive classification metrics

Get an intuitive sense for the ROC curve and other binary classification metrics with interactive visualization.

[Example animation]

This is a tool for teaching and for building understanding. Change the statistics of the normal distributions, or the classification threshold, to see how each change affects the different classification metrics. Read the paper or blog post for more information.
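
The relationship described above can be sketched in a few lines of Python. This is a minimal sketch, not the application's code: it assumes each class's scores follow its own normal distribution and that a score at or above the threshold is predicted positive. The function name `rates_at_threshold`, its parameters, and the default means and standard deviations are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def rates_at_threshold(threshold, neg_mean=0.0, neg_sd=1.0, pos_mean=2.0, pos_sd=1.0):
    """Return (TPR, FPR) for two normally distributed classes.

    Assumption: a score >= threshold is predicted positive.
    All names and default values here are illustrative only.
    """
    neg = NormalDist(neg_mean, neg_sd)
    pos = NormalDist(pos_mean, pos_sd)
    # Fraction of each class whose score lies at or above the threshold.
    tpr = 1.0 - pos.cdf(threshold)
    fpr = 1.0 - neg.cdf(threshold)
    return tpr, fpr


# Sweeping the threshold traces out points on the ROC curve.
for t in (-1.0, 0.0, 1.0, 2.0, 3.0):
    tpr, fpr = rates_at_threshold(t)
    print(f"threshold={t:+.1f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
```

Moving the two means closer together, or widening either standard deviation, increases the overlap between the classes and pulls the (FPR, TPR) points toward the diagonal.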

* Matthews correlation coefficient (MCC) is represented as unit-normalized MCC, as in Cao et al. 2020.
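
For reference, unit-normalized MCC rescales MCC from the range [-1, 1] to [0, 1] as (MCC + 1) / 2. The sketch below computes it from confusion-matrix counts. It is an illustration rather than the application's implementation: the function names are hypothetical, and returning 0 when the denominator is zero is a common convention assumed here.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: MCC is undefined here, so report 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


def unit_normalized_mcc(tp, fp, tn, fn):
    """Rescale MCC from [-1, 1] to [0, 1]."""
    return (mcc(tp, fp, tn, fn) + 1.0) / 2.0


# Example counts, made up for illustration.
print(unit_normalized_mcc(tp=40, fp=10, tn=45, fn=5))
```

Under this rescaling, a value of 0.5 corresponds to MCC = 0, which is the level expected from random guessing.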

Install & Run

From PyPI

Creating a dedicated Python environment first is recommended.

python3 -m pip install interactive-classification-metrics

Run with Bokeh server locally from the command line:

run-app

This opens a web browser where you can use the application.

By cloning the repo

  1. Clone this repo: git clone https://github.com/davhbrown/interactive_classification_metrics.git
  2. cd interactive_classification_metrics
  3. Create a dedicated Python environment (recommended)
  4. pip install -r requirements.txt

Run with Bokeh server locally from the command line:

bokeh serve --show serve.py

This opens a web browser where you can use the application.

Paper

Brown DH, Chicco D (2024) Interactive Classification Metrics: A graphical application to build robust intuition for classification model evaluation. arXiv:2412.17066

https://doi.org/10.48550/arXiv.2412.17066

Inspired by

  • Cao C, Chicco D, Hoffman MM (2020) The MCC-F1 curve: a performance evaluation technique for binary classification. https://doi.org/10.48550/arXiv.2006.11278
  • arthurcgusmao, the author of py-mcc-f1 used here
  • Chicco D, Tötsch N, Jurman G (2021) The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Mining 14:13, 1-22.
  • Chicco D, Jurman G (2023) The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Mining 16:4, 1-23.
  • the spirit of this tweet

Acknowledgments

Special thanks to Dr. Davide Chicco (@davidechicco) for feedback and collaboration on this project.

Cite as

@misc{brown2024icm,
      title={Interactive Classification Metrics: A graphical application to build robust intuition for classification model evaluation},
      author={David H. Brown and Davide Chicco},
      year={2024},
      eprint={2412.17066},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2412.17066},
}