API Reference
The /api/detect_pii endpoint performs simple PII detection: pass in text, and it returns a boolean indicating whether PII was detected, along with the redacted text. Note that this endpoint still needs further optimization.
curl -X POST -H "Content-Type: application/json" -d '{"text": "My name is John Doe, and my email is [email protected]"}' http://127.0.0.1:5000/api/detect_pii
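The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl command above. The helper names are hypothetical, and it assumes the endpoint answers with a JSON body; the response field names are not documented here, so the parsed object is returned as-is.

```python
import json
import urllib.request

# URL copied from the curl example above.
DETECT_PII_URL = "http://127.0.0.1:5000/api/detect_pii"


def build_detect_pii_request(text, url=DETECT_PII_URL):
    """Build (but do not send) a JSON POST request for /api/detect_pii."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect_pii(text, url=DETECT_PII_URL):
    """Send the request and return the decoded response.

    Assumption: the server replies with JSON. The parsed object is
    returned unchanged so the caller can inspect its fields.
    """
    request = build_detect_pii_request(text, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, calling detect_pii("My name is John Doe") should return the parsed response.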
The /api/process_csv endpoint allows users to pass in a CSV file and specify whether they wish to:
- identify and redact information using one of three methods:
  - insert a fixed “[REDACT]” string
  - replace with anonymized values
  - hash the values
- substitute in synthetic data
The output can be downloaded in the same format as the original file.
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.csv" -F "redaction_method=hash" http://127.0.0.1:6000/api/process_csv --output redacted_output.csv
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.csv" -F "synthetic_data_generation=true" http://127.0.0.1:6000/api/process_csv --output redacted_output.csv
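For callers who prefer Python to curl, the upload can be sketched with the standard library alone. The helper names below are hypothetical; the form field names (file, redaction_method) and the URL are taken from the curl commands above. The multipart body is assembled by hand because urllib has no built-in form encoder.

```python
import os
import urllib.request
import uuid

# URL copied from the curl example above.
PROCESS_CSV_URL = "http://127.0.0.1:6000/api/process_csv"


def encode_multipart(fields, filename, file_bytes, file_content_type="text/csv"):
    """Encode text fields plus one file part named "file" as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        f"Content-Type: {file_content_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def process_csv(csv_path, output_path, redaction_method="hash", url=PROCESS_CSV_URL):
    """Upload a CSV file and write the returned file to output_path."""
    with open(csv_path, "rb") as handle:
        file_bytes = handle.read()
    body, content_type = encode_multipart(
        {"redaction_method": redaction_method},
        os.path.basename(csv_path),
        file_bytes,
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as out:
        out.write(response.read())
```

To request synthetic data instead, pass {"synthetic_data_generation": "true"} as the fields, matching the second curl command.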
Similar to /api/process_csv, you can use the /api/process_excel endpoint to upload an Excel file (.xls, .xlsx) and specify a redaction method to remove sensitive data from the file. The redacted file can be downloaded in the same format as the original file. The endpoint also supports synthetic data generation to replace sensitive data with realistic but fake data.
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.xls" -F "redaction_method=hash" "http://127.0.0.1:5000/api/process_excel" --output redacted_output.xls
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.xlsx" "http://127.0.0.1:6000/api/process_excel" -F "synthetic_data_generation=true" --output redacted_output.xlsx
The /api/process_json endpoint allows users to upload a JSON file and specify a redaction method to remove sensitive data from the file. The redacted file can be downloaded in the same format as the original file. The endpoint also supports synthetic data generation.
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@/Path/to/your.json" -F "redaction_method=hash" http://localhost:5000/api/process_json -o redacted_output.json
Coming soon!
The request body for all three endpoints contains the following parameters:
- file: The file to be redacted. The file must be of the appropriate type for the endpoint being used (CSV, Excel, or JSON).
- redaction_method: The method to be used for redacting sensitive data from the file. The following methods are available:
  - fixed_string: Replace sensitive data with a fixed string.
  - random_value: Replace sensitive data with a random value.
  - hash: Replace sensitive data with a hash of the data.
- synthetic_data_generation: Whether to generate synthetic data to replace sensitive data in the file. This parameter is optional and defaults to false.
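To make the three redaction_method values concrete, the sketch below shows one way each could transform a single value. It is an illustration, not the service's implementation: the function name redact_value, the use of SHA-256 for hash, and a random hex token for random_value are all assumptions. Only the “[REDACT]” placeholder comes from the description of /api/process_csv.

```python
import hashlib
import secrets


def redact_value(value, method="fixed_string"):
    """Return a stand-in for one sensitive value (illustration only)."""
    if method == "fixed_string":
        # Every value is replaced by the same placeholder.
        return "[REDACT]"
    if method == "random_value":
        # Random 16-character hex token, unrelated to the input (assumed format).
        return secrets.token_hex(8)
    if method == "hash":
        # Deterministic: equal inputs produce equal digests (SHA-256 is assumed).
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    raise ValueError(f"unknown redaction method: {method!r}")
```

A practical difference between the methods: a hash maps equal inputs to equal outputs, so repeated values remain matchable across rows, while a random value or a fixed string discards that link.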
The response body for all three endpoints contains the redacted file in the same format as the original file.
The API returns the following status codes:
- 200: The request was successful.
- 400: The request was malformed or invalid.
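A client can branch on these two codes as sketched below. The helper names are hypothetical. Note that urllib raises HTTPError for a 400 response, so the status has to be read from the exception rather than from the normal return value.

```python
import urllib.error
import urllib.request


def describe_status(status):
    """Map a status code to the meaning documented above."""
    if status == 200:
        return "success"
    if status == 400:
        return "malformed or invalid request"
    # Any other code is not covered by this reference.
    return f"undocumented status {status}"


def send(request):
    """Send a prepared urllib request and return (status, body_bytes)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive here; the body may explain the failure.
        return error.code, error.read()
```

A caller would typically write the body to disk only when the status is 200 and log the body otherwise.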