I think I found an issue that occurs when the probationary_period for KNNCAD is set too low.
This tripped me up a little, so I thought it was worth raising here. I'm not sure what the solution would be; maybe some sort of reasonable default for probationary_period in KNNCAD would help others avoid this in future.
Or maybe it's fine and people simply should not set such a low probationary_period, but it was one of the first things I did, so others might too :)
Reproducible example:
# Import modules.
from sklearn.utils import shuffle
from pysad.evaluation import AUROCMetric
from pysad.models import xStream, RobustRandomCutForest, KNNCAD
from pysad.utils import ArrayStreamer
from pysad.transform.postprocessing import RunningAveragePostprocessor
from pysad.transform.preprocessing import InstanceUnitNormScaler
from pysad.utils import Data
from tqdm import tqdm
import numpy as np

# This example demonstrates the usage of most modules in the PySAD framework.
if __name__ == "__main__":
    np.random.seed(61)  # Fix random seed.

    # Get data to stream.
    data = Data("data")
    X_all, y_all = data.get_data("arrhythmia.mat")
    X_all, y_all = shuffle(X_all, y_all)

    iterator = ArrayStreamer(shuffle=False)  # Init streamer to simulate streaming data.

    model = KNNCAD(probationary_period=10)  # A low value like this triggers the error.
    #model = RobustRandomCutForest()
    #model = xStream()  # Init xStream anomaly detection model.
    preprocessor = InstanceUnitNormScaler()  # Init normalizer.
    postprocessor = RunningAveragePostprocessor(window_size=5)  # Init running average postprocessor.
    auroc = AUROCMetric()  # Init area under the receiver operating characteristic curve metric.

    for X, y in tqdm(iterator.iter(X_all[100:], y_all[100:])):  # Stream data.
        X = preprocessor.fit_transform_partial(X)  # Fit preprocessor to and transform the instance.

        score = model.fit_score_partial(X)  # Fit model to and score the instance.
        score = postprocessor.fit_transform_partial(score)  # Apply running averaging to the score.

        auroc.update(y, score)  # Update AUROC metric.

    # Output the resulting AUROC metric.
    print("\nAUROC: ", auroc.get())
This gives the following error:
/usr/local/lib/python3.6/dist-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.utils.testing module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.utils. Anything that cannot be imported from sklearn.utils is now part of the private API.
warnings.warn(message, FutureWarning)
0it [00:00, ?it/s]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-3-c8fd98afee64> in <module>()
31 X = preprocessor.fit_transform_partial(X) # Fit preprocessor to and transform the instance.
32
---> 33 score = model.fit_score_partial(X) # Fit model to and score the instance.
34 score = postprocessor.fit_transform_partial(score) # Apply running averaging to the score.
35
1 frames
/usr/local/lib/python3.6/dist-packages/pysad/models/knn_cad.py in fit_partial(self, X, y)
73 self.training.append(self.calibration.pop(0))
74
---> 75 self.scores.pop(0)
76 self.calibration.append(new_item)
77 self.scores.append(new_score)
IndexError: pop from empty list
If I set the probationary_period to 25, I see a slightly different error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-fb6b7ffc5fde> in <module>()
31 X = preprocessor.fit_transform_partial(X) # Fit preprocessor to and transform the instance.
32
---> 33 score = model.fit_score_partial(X) # Fit model to and score the instance.
34 score = postprocessor.fit_transform_partial(score) # Apply running averaging to the score.
35
4 frames
<__array_function__ internals> in partition(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py in partition(a, kth, axis, kind, order)
744 else:
745 a = asanyarray(a).copy(order="K")
--> 746 a.partition(kth, axis=axis, kind=kind, order=order)
747 return a
748
ValueError: kth(=28) out of bounds (6)
Then, if I set probationary_period=50, it works.
So it feels like I may be hitting some sort of edge case when probationary_period is low.
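Both exceptions can be reproduced in isolation, which may help narrow down where the edge case comes from. The sketch below does not use pysad at all; it only mirrors the two failing calls shown in the tracebacks (a pop(0) on an empty list, and a NumPy partition with kth=28 on a 6-element array). The names try_call, empty_scores and distances are made up for illustration, and I have not confirmed against knn_cad.py that these are the exact buffers involved.

```python
import numpy as np


def try_call(fn):
    """Run fn and return the name of the exception type it raises, or None."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__
    return None


# First failure: popping from a list that has not been filled yet,
# as in the self.scores.pop(0) line of the first traceback.
empty_scores = []
first_error = try_call(lambda: empty_scores.pop(0))

# Second failure: asking NumPy to partition around index 28 of an
# array that only has 6 elements, as in the second traceback.
distances = np.zeros(6)
second_error = try_call(lambda: np.partition(distances, 28))

print(first_error, second_error)
```

If that reading is right, both errors come from buffers that have collected fewer items than the model expects by the time the probationary period ends.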
I'm happy to work on a PR if there is an easy fix we can make, or even if we just want to set a default that would stop people from doing what I did :)
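As a rough idea of what such a fix could look like, here is a minimal sketch of an early validation check. The helper check_probationary_period and its minimum parameter are hypothetical and not part of pysad; the right lower bound would have to be worked out from KNNCAD's internals. The value 50 in the usage lines is only the setting that happened to work in my run, not a verified threshold.

```python
def check_probationary_period(probationary_period, minimum):
    """Fail fast with a clear message if probationary_period is too small.

    `minimum` is a placeholder for whatever lower bound the model needs;
    it is an assumption here, not a value taken from pysad.
    """
    if isinstance(probationary_period, bool) or not isinstance(probationary_period, int):
        raise TypeError("probationary_period must be an integer")
    if probationary_period < minimum:
        raise ValueError(
            f"probationary_period={probationary_period} is too low; "
            f"expected at least {minimum}"
        )
    return probationary_period


# Usage: 50 worked in the run above, 10 did not (the bound itself is assumed).
accepted = check_probationary_period(50, minimum=50)

try:
    check_probationary_period(10, minimum=50)
    rejected_message = None
except ValueError as exc:
    rejected_message = str(exc)

print(accepted)
print(rejected_message)
```

A check like this would turn the IndexError and ValueError above into a single, readable message at construction time rather than partway through the stream.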