Running the Data Discovery samples
Use the information in this section to run the Data Discovery samples provided in the data-discovery/samples folder. These samples demonstrate how to use the Data Discovery API for classification and redaction of sensitive information in text and tabular data.
Running Data Discovery
The example scripts under the data-discovery/ folder demonstrate classification and redaction using the Data Discovery v2 API. For more information about the Data Discovery APIs, refer to the section Data Discovery APIs.
Note: A dedicated data-discovery/docker-compose.yml is provided to start only the Data Discovery service.
Open a command prompt.
Navigate to the directory where AI Developer Edition is cloned.
Launch the Data Discovery services. For setup instructions, refer to the docker compose setup page.
Run any of the example scripts from the data-discovery/ directory:
Classification - text input
python data-discovery/samples/python/sample-classification-python-text.py
bash data-discovery/samples/bash/sample-classification-bash-text.sh
Classification - tabular (CSV) input
python data-discovery/samples/python/sample-classification-python-tabular.py
bash data-discovery/samples/bash/sample-classification-bash-tabular.sh
Redaction
python data-discovery/samples/python/sample-redaction-python.py
bash data-discovery/samples/bash/sample-redaction-bash.sh
View the output of the files processed on the screen. The output displays the classification labels or redacted text returned by the Data Discovery service.
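To run the three Python samples in one pass, a small wrapper along the following lines may be convenient. This is only a sketch: the script paths are the ones listed in the steps above, while the helper names build_command and run_samples are illustrative and not part of the package. It assumes the working directory is the directory where AI Developer Edition is cloned and that the Data Discovery service is already running.

```python
import subprocess
import sys
from pathlib import Path

# Sample paths as listed in the steps above, relative to the clone directory.
PYTHON_SAMPLES = [
    "data-discovery/samples/python/sample-classification-python-text.py",
    "data-discovery/samples/python/sample-classification-python-tabular.py",
    "data-discovery/samples/python/sample-redaction-python.py",
]


def build_command(script):
    """Return the command that runs one sample with the current interpreter."""
    return [sys.executable, script]


def run_samples(root=Path(".")):
    """Run each sample from `root`.

    Returns a dict mapping each script path to its exit code,
    or to None when the script is not found under `root`.
    """
    results = {}
    for script in PYTHON_SAMPLES:
        if not (Path(root) / script).is_file():
            results[script] = None  # not found, nothing to run
            continue
        completed = subprocess.run(build_command(script), cwd=root)
        results[script] = completed.returncode
    return results


if __name__ == "__main__":
    for script, code in run_samples().items():
        status = "not found" if code is None else f"exit code {code}"
        print(f"{script}: {status}")
```

Each sample still prints its own output to the screen; the wrapper only adds a per-script summary at the end.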
Using Notebooks for Classifying and Redacting unstructured documents
The notebook demonstrates how to use the Data Discovery API with Python’s requests library to classify and redact sensitive information in unstructured text and tabular data. It submits sample data containing sensitive information to a local Data Discovery service for classification. It also shows how the Transform API replaces detected PII entities with standardized labels, for example, [PERSON] or [SOCIAL_SECURITY_ID].
Make sure that Jupyter Notebook is installed on your system.
Navigate to the directory where AI Developer Edition is cloned.
Run the following command to start Jupyter Lab.
jupyter lab
Copy the URL displayed and navigate to the site from a web browser. Ensure that localhost is replaced with the IP address of the system where the AI Developer Edition is set up.
Open the example at:
data-discovery/samples/jupyter/sample-classification-jupyter-text.ipynb
data-discovery/samples/jupyter/sample-classification-jupyter-tabular.ipynb
data-discovery/samples/jupyter/sample-redaction-jupyter-text.ipynb
Run all cells and see the results of the execution interactively.
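The localhost replacement mentioned in the steps above can also be done programmatically. The following is a minimal sketch using only the Python standard library; the URL, token, and IP address shown are placeholders, and the function name replace_host is illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url: str, new_host: str) -> str:
    """Swap a localhost host name for `new_host`, keeping port, path, and query."""
    parts = urlsplit(url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return url  # nothing to replace
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Placeholder values: use the URL printed by Jupyter Lab and the IP address
# of the system where the AI Developer Edition is set up.
print(replace_host("http://localhost:8888/lab?token=abc123", "192.0.2.10"))
# prints: http://192.0.2.10:8888/lab?token=abc123
```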