This repository has been archived by the owner on Oct 21, 2024. It is now read-only.

Commit

FIX: add line for check duplicates_save_path is None
p-idx authored and 41ow1ives committed May 9, 2024
1 parent 757e87e commit a0adedc
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion dataverse/etl/deduplication/minhash.py
@@ -215,7 +215,10 @@ def deduplication___minhash___lsh_jaccard(
     elif isinstance(data, DataFrame):
         data_df = data

-    if os.path.exists(duplicates_save_path):
+    if (
+        duplicates_save_path is not None
+        and os.path.exists(duplicates_save_path)
+    ):
         assert "duplicates_save_path already exists."

     temp_id_col, component_col, tokens_col, ngrams_col = \
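The guard added in this commit can be sketched in isolation as follows. This is a minimal illustration, not the dataverse implementation: `check_save_path` is a hypothetical helper name, and raising `FileExistsError` is an assumption made for the sketch. The diff itself keeps `assert "duplicates_save_path already exists."`, and an `assert` on a non-empty string literal always passes in Python, so that line does not stop execution when the path exists.

```python
import os


def check_save_path(duplicates_save_path=None):
    """Hypothetical helper (not part of dataverse) mirroring the guard in this commit."""
    # Short-circuit on None so os.path.exists is never called without a path.
    if duplicates_save_path is not None and os.path.exists(duplicates_save_path):
        # Assumption for this sketch: raise explicitly, because a bare
        # `assert "message"` is always truthy and never fails.
        raise FileExistsError(f"{duplicates_save_path} already exists.")
    return duplicates_save_path
```

Calling `check_save_path(None)` returns `None` without touching the filesystem, while passing a path that already exists raises.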
