General Guidance for Contributors

Development

Clone the repository and install the local package including all dependencies within a virtual environment:

$ git clone https://github.com/AI-SDC/SACRO-ML.git
$ cd SACRO-ML
$ pip install .[test]

Then to run the tests:

$ pytest .

Directory Structure

  • sacroml Contains the sacroml source code.
    • attacks Contains a variety of privacy attacks on machine learning models.
    • preprocessing Contains preprocessing modules for test datasets.
    • safemodel The safemodel wrappers for common machine learning models.
  • docs Contains Sphinx documentation files.
  • examples Contains examples of how to run the code contained in this repository.
  • tests Contains unit tests.
  • user_stories Contains user guides.

Documentation

Documentation is hosted here: https://ai-sdc.github.io/SACRO-ML/

Style Guide

Python code should be linted with pylint.

A pre-commit configuration file is provided to automatically:

  • Trim trailing whitespace and fix line endings;
  • Check for spelling errors with codespell;
  • Check and format JSON files;
  • Format Python and notebooks with black;
  • Upgrade Python syntax with pyupgrade;
  • Automatically remove unused Python imports;
  • Sort Python imports.

Pre-commit can be set up as follows:

$ pip install pre-commit

Then to run on all files in the repository:

$ pre-commit run -a

To install as a hook that executes with every git commit:

$ pre-commit install

Automatic Documentation

The documentation is automatically built using Sphinx and GitHub Actions.

The source files in docs/source are parsed and compiled into HTML files in docs/_build. The contents of docs/_build are pushed to the gh-pages branch, which is then automatically deployed to the github.io site.

The main configuration file is docs/source/conf.py. In most cases the path variable will pick up any source to document; occasionally, directories might need adding to the path. When adding one, please ensure you use abspath().
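As a minimal sketch of the path setup described above, the following could sit near the top of conf.py. The relative path "../.." is an assumption about where the package lives relative to docs/source, and REPO_ROOT is a hypothetical name; neither is taken from the project's actual configuration.

```python
import os
import sys

# Assumption: conf.py lives in docs/source, so the repository root
# (which contains the package to document) is two levels up.
REPO_ROOT = os.path.abspath(os.path.join("..", ".."))

# Insert the absolute path first so that autodoc imports the local
# source tree rather than any installed copy of the package.
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)
```

Using abspath() keeps the entry valid even if the build later changes its working directory.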

Sphinx reads the docstrings in the Python source.

It uses the numpydoc format, so your code should be documented with numpydoc-style docstrings.

Quick Start

Need to get your documentation into the generated docs? If your docstrings are in the right format, this method should work for most cases:

  1. Go to docs/source
  2. Make a copy of an rst file, e.g., safedecisiontree.rst
  3. Edit the new file and change the title and automodule line.

Data Interface
==============

An example Python Notebook is available `Here <https://github.com/jim-smith/GRAIMatter/blob/main/WP2/wrapper/wrapper-concept.ipynb>`__

.. automodule:: preprocessing.loaders
   :members:

  4. Save the new file
  5. Edit the index.rst and insert the new filename (without the .rst) into the correct position in the list.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   introduction
   attacks
   safemodel
   safedecisiontree
   saferandomforest
   safekeras
   datainterface

  6. Save index.rst
  7. Push your updates to main
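
Steps 2 and 3 above can also be scripted. The sketch below is illustrative only: make_rst_stub is a hypothetical helper, not part of this repository, and it simply produces the title, underline, and automodule directive shown in the example page.

```python
def make_rst_stub(title: str, module: str) -> str:
    """Return the text of a minimal .rst page documenting one module."""
    # reStructuredText expects the underline to be at least as long as the title.
    underline = "=" * len(title)
    return (
        f"{title}\n"
        f"{underline}\n"
        "\n"
        f".. automodule:: {module}\n"
        "   :members:\n"
    )


# Example: reproduce the skeleton of the page shown in the steps above.
stub = make_rst_stub("Data Interface", "preprocessing.loaders")
print(stub)
```

The returned text can be written to a new file in docs/source, after which the filename still needs adding to index.rst as described in step 5.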

DocStrings

An example docstring from the safemodel source is below:

class SafeModel:
    """Privacy protected model base class.

    Attributes
    ----------
    model_type : string
        A string describing the type of model. Default is "None".
    model
        The Machine Learning Model.
    saved_model
        A saved copy of the Machine Learning Model used for comparison.
    ignore_items : list
        A list of items to ignore when comparing the model with the
        saved_model.
    examine_separately_items : list
        A list of items to examine separately. These items are more
        complex data structures that cannot be compared directly.
    filename : string
        A filename to save the model.
    researcher : string
        The researcher user-id used for logging.

    Notes
    -----

    Examples
    --------
    >>> safeRFModel = SafeRandomForestClassifier()
    >>> safeRFModel.fit(X, y)
    >>> safeRFModel.save(name="safe.pkl")
    >>> safeRFModel.preliminary_check()
    >>> safeRFModel.request_release(filename="safe.pkl")
    WARNING: model parameters may present a disclosure risk:
    - parameter min_samples_leaf = 1 identified as less than the recommended min value of 5.
    Changed parameter min_samples_leaf = 5.
    Model parameters are within recommended ranges.
    """

Static and Generated Content

The .rst files in docs/source/ are a mixture of static and generated content. Static content should be written in reStructuredText (.rst) format.

For a short introduction to the format, see the reStructuredText Primer.

Automatic code documentation from the docstrings uses Sphinx directives in the .rst files like this:

.. automodule:: safemodel.classifiers.safedecisiontreeclassifier
   :members:

Images

It is possible to include images like this:

.. image:: stars.jpg
    :width: 200px
    :align: center
    :height: 100px
    :alt: alternate text

Generating docs locally

It is useful to be able to generate your docs locally (to check for bugs, etc.).

On GNU/Linux, navigate to the docs folder and then issue the command make html.

On Windows, navigate to the docs folder and then issue the command sphinx-build source _build.

The generated HTML will be in the folder docs/_build and can be opened in any browser.
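
For a single command that behaves the same on both platforms, the build can also be launched through Python. This is a sketch under stated assumptions: sphinx_build_command is a hypothetical helper, Sphinx is assumed to be installed in the active environment, and the source and _build folder names are those given above.

```python
import sys
from pathlib import Path


def sphinx_build_command(docs_dir: str) -> list[str]:
    """Return the argument list for an HTML build of the docs folder."""
    docs = Path(docs_dir)
    # "python -m sphinx" is the module form of the sphinx-build command.
    return [
        sys.executable, "-m", "sphinx",
        "-b", "html",
        str(docs / "source"),
        str(docs / "_build"),
    ]


# Usage (run from the repository root):
#   import subprocess
#   subprocess.run(sphinx_build_command("docs"), check=True)
```

Returning the argument list rather than running it directly makes the command easy to inspect before it is passed to subprocess.run.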