sealwatch

A Python package containing implementations of modern image steganalysis algorithms.

⚠️ This project is under intensive development as we speak.

Installation

Install the package with pip3:

pip3 install sealwatch

or install it from a clone of the repository:

git clone https://github.com/uibk-uncover/sealwatch/
cd sealwatch
pip3 install .

Import to Python by typing

import sealwatch as sw
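
To confirm which version ended up installed, the standard-library `importlib.metadata` module can be queried. The sketch below is a generic check and not part of sealwatch; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("sealwatch")
    print(f"sealwatch version: {version}" if version else "sealwatch is not installed")
```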

Contents

| Abbreviation | Name | Dimensionality | Domain | Reference | Output format |
|---|---|---|---|---|---|
| SPAM | subtractive pixel adjacency matrix | 686 | spatial | Reference | ordered dict |
| JRM | JPEG rich model | 11255 | JPEG | Reference | ordered dict |
| CC-JRM | cartesian-calibrated JRM | 22510 | JPEG | Reference | ordered dict |
| DCTR | discrete cosine transform residual features | 8000 | spatial | Reference | ordered dict |
| PHARM | phase-aware projection rich model | 12600 | JPEG | Reference | ordered dict |
| GFR | Gabor filter residual features | 17000 | JPEG | Reference | 5D array |
| SRM | spatial rich models | 34671 | spatial | Reference | ordered dict |
| SRMQ1 | SRM with quantization 1 | 12753 | spatial | Reference | ordered dict |
| CRM | color rich models | 5404 | spatial | Reference | ordered dict |

These implementations are based on the Matlab reference implementations provided by the DDE lab at Binghamton University.

Usage

Extract GFR features from a single JPEG image

features = sw.gfr.extract("seal1.jpg")
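
Training a classifier needs one feature vector per image. A minimal sketch of batch extraction follows. `extract_batch` is a hypothetical helper, not a sealwatch function; it takes the extractor as an argument so that it assumes nothing about sealwatch beyond the single call shown above. It also assumes that every extractor output can be converted to a NumPy array of the same size.

```python
from typing import Callable, Iterable

import numpy as np


def extract_batch(paths: Iterable[str], extractor: Callable[[str], object]) -> np.ndarray:
    """Apply `extractor` to each path and stack the flattened results row-wise."""
    rows = [np.asarray(extractor(path)).ravel() for path in paths]
    if not rows:
        raise ValueError("no input paths given")
    return np.stack(rows, axis=0)  # shape: (num_images, num_features)
```

With sealwatch installed, a call such as `extract_batch(["seal1.jpg"], sw.gfr.extract)` would then give a matrix with one row per image.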

After extracting features from cover and stego images, you can train an FLD ensemble as a binary classifier.

import numpy as np
import sealwatch as sw

Xc_tr, Xs_tr, Xc_te, Xs_te = sw.ensemble_classifier.helpers.load_and_split_features(
    cover_features_filename="cover_features.h5",
    stego_features_filename="stego_features.h5",
    train_csv="train.csv",
    test_csv="test.csv",
)

# Training is faster when arrays are C-contiguous
Xc_tr = np.ascontiguousarray(Xc_tr)
Xs_tr = np.ascontiguousarray(Xs_tr)

# The hyper-parameter search is wrapped inside a trainer class
trainer = sw.ensemble_classifier.FldEnsembleTrainer(
    Xc=Xc_tr,
    Xs=Xs_tr,
    seed=12345,
    verbose=1,
)

# Train with hyper-parameter search
trained_ensemble, training_records = trainer.train()

# Concatenate the test features and labels (-1 = cover, +1 = stego)
X_test = np.concatenate((Xc_te, Xs_te), axis=0)
y_test = np.concatenate((
    -np.ones(len(Xc_te)),
    +np.ones(len(Xs_te))
), axis=0)

# Calculate test accuracy
test_accuracy = trained_ensemble.score(X_test, y_test)
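
For reference, the accuracy over such a label vector can also be computed by hand, together with the false-alarm and missed-detection rates that are commonly reported in steganalysis. The sketch below uses made-up predictions and the -1/+1 label convention from the snippet above; `detection_rates` is a hypothetical helper, not part of sealwatch.

```python
import numpy as np


def detection_rates(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute accuracy, false-alarm rate, and missed-detection rate.

    Labels follow the convention -1 = cover, +1 = stego.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cover = y_true == -1
    stego = y_true == +1
    return {
        "accuracy": float(np.mean(y_pred == y_true)),
        "p_fa": float(np.mean(y_pred[cover] == +1)),  # covers flagged as stego
        "p_md": float(np.mean(y_pred[stego] == -1)),  # stego images missed
    }


# Made-up example: 4 covers followed by 4 stego images
y_true = np.concatenate((-np.ones(4), +np.ones(4)))
y_pred = np.array([-1, -1, -1, +1, +1, +1, -1, -1])
rates = detection_rates(y_true, y_pred)
```

When the test set holds equally many cover and stego images, the accuracy equals 1 - (p_fa + p_md) / 2.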

Feature formats

Note that the feature extractors return different formats: 1D arrays, multi-dimensional arrays, or ordered dicts. The reason is that feature descriptors are composed of multiple submodels. Retaining the structure allows the user to select a specific submodel. The following snippets show how to flatten the features to a 1D array.

Ordered dict

# The PHARM feature extraction returns an ordered dict
features_grouped = sw.pharm.extract_from_file("seal1.jpg", implementation=sw.PHARM_REVISITED)

# Flatten dict to a 1D array
features = sw.tools.flatten(features_grouped)
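
To illustrate why the grouped format is useful, the sketch below builds a toy ordered dict and flattens it with plain NumPy. The submodel names and sizes are invented, and concatenating in dict order is an assumption about what flattening means here, not a description of how `sw.tools.flatten` is implemented.

```python
from collections import OrderedDict

import numpy as np

# Toy stand-in for a grouped feature dict; names and sizes are made up
features_grouped = OrderedDict([
    ("submodel_a", np.arange(6, dtype=float).reshape(2, 3)),
    ("submodel_b", np.array([10.0, 11.0])),
])

# Select a single submodel by its key
only_a = features_grouped["submodel_a"].ravel()

# Concatenate all submodels in insertion order into one 1D vector
flat = np.concatenate([np.ravel(v) for v in features_grouped.values()])
```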

Multi-dimensional array

# The GFR feature extraction returns a 5-dimensional array:
# - Dimension 0: Phase shifts
# - Dimension 1: Scales
# - Dimension 2: Rotations/Orientations
# - Dimension 3: Number of histograms
# - Dimension 4: Co-occurrences
features = sw.gfr.extract("seal1.jpg")

# Simply flatten to a 1D array
features = features.flatten()
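
Before flattening, the array structure can be used to pick out part of the descriptor. The following sketch uses a zero-filled placeholder instead of real features. Its shape (2, 4, 17, 25, 5) is an assumption, chosen only because its product matches the 17000 dimensions listed for GFR in the table above; check `features.shape` on real output.

```python
import numpy as np

# Placeholder with an ASSUMED shape:
# (phase shifts, scales, rotations, histograms, co-occurrences)
features = np.zeros((2, 4, 17, 25, 5), dtype=np.float32)

# Keep only the first scale (dimension 1), then flatten
first_scale = features[:, 0].flatten()

# Flatten everything; flatten() returns a copy in C (row-major) order
flat = features.flatten()

# The original layout can be restored from the flat vector
restored = flat.reshape(features.shape)
```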

Acknowledgements and Disclaimer

Developed by Martin Benes and Benedikt Lorch, University of Innsbruck, 2024.

The implementations of feature extractors and the detector in this package are based on the original Matlab code provided by the Digital Data Embedding Lab at Binghamton University.

We have made our best effort to ensure that our implementations produce results identical to those of the original Matlab implementations. However, it is the user's responsibility to verify this. For notes on compatibility with the previous implementation, see compatibility.md.