AI Developer Edition
- 1: Introduction to Protegrity AI Developer Edition
- 2: AI Developer Edition Architecture
- 3: Setting up AI Developer Edition
- 3.1: Prerequisites
- 3.2: Optional - Obtaining access to the AI Developer Edition API Service
- 3.3: Setting up the packages
- 3.4: Verifying the files in the Protegrity AI Developer Edition package
- 4: Running the sample application
- 5: Customizing the sample application
- 6: Building the Python module
- 7: Appendix
1 - Introduction to Protegrity AI Developer Edition
Protegrity AI Developer Edition is a lightweight, containerized sandbox. It lets developers and data scientists quickly prototype, test, and integrate data protection and discovery into their workflows. It does not require setting up a complex infrastructure and managing its operational overhead.
It is a self-contained, Docker-based environment designed to help developers, data scientists, and architects quickly explore and prototype data protection and discovery workflows. It enables hands-on experimentation without the need for enterprise infrastructure. With a modular architecture, built-in sample data, and a developer-first experience, AI Developer Edition is ideal for evaluating Protegrity’s capabilities in a fast, flexible, and frictionless way.
What is Protegrity AI Developer Edition?
Protegrity AI Developer Edition is designed to help a developer move quickly from idea to implementation, using familiar tools, sample apps, and open APIs.
It provides a streamlined environment to:
- Discover and redact sensitive data using APIs and sample apps.
- Discover and protect sensitive data using APIs and sample apps.
- Perform message and conversation level risk scoring.
- Scan for personally identifiable information (PII) in GenAI flows.
- Test real-world use cases with sample datasets and guided walkthroughs.
AI Developer Edition runs entirely on Docker, making it easy to spin up, tear down, and iterate quickly. It helps the user build a proof of concept, validate integration points, and get familiar with Protegrity’s core concepts. This edition provides the tools to set up the product fast and independently.
This product is not meant for production use, but it is the perfect launchpad for innovation.
Key Features
AI Developer Edition is purpose-built for fast, frictionless exploration of Protegrity’s core capabilities.
The following features make it ideal for prototyping and integration:
Modular, Containerized Architecture: AI Developer Edition runs on Docker, making it easy to test, isolate, and iterate.
Sample Apps and Data: Jumpstart evaluation with ready-to-run sample apps that demonstrate real-world use cases, such as finding sensitive data in unstructured text or finding and redacting sensitive data.
Python Module: This version includes an open-source Python module to use Protegrity in the development environment.
Lightweight: No Enterprise Security Administrator (ESA). No orchestration overhead. Just deploy the container and use the sample application.
Data Discovery: This container identifies, classifies, masks, redacts, or protects sensitive data. It uses built-in and custom classifiers to detect sensitive data with confidence scoring.
Semantic Guardrails: This container is used to analyze conversational data and apply privacy and appropriateness filters. This feature helps enforce content boundaries and detect PII using Protegrity’s Data Discovery engine.
AI Developer Edition API Service: A service hosted by Protegrity that allows developers to interact with Protegrity’s protection and discovery services through intuitive endpoints. It supports protection and unprotection of sensitive data, enabling rapid prototyping and testing of data protection scenarios without needing full-scale infrastructure. Registration is required for this service. The credentials can be obtained for free.
This product is continuously improving. The features mentioned here are either already available or will be available shortly.
Protegrity AI Developer Edition Personas
The following table lists the primary personas who benefit most from AI Developer Edition.
| Persona | Role Description | Goals | Typical Activities |
|---|---|---|---|
| Application Developer | Builds and integrates applications that handle sensitive data. | - Embed protection APIs. - Prototype quickly. - Validate integration points. | - Run sample apps. |
| Data Scientist or ML Engineer | Works with sensitive datasets in analytics and machine learning workflows. | - Discover and classify PII. - Protect training data. - Ensure compliance. | - Use discovery APIs. - Integrate with Jupyter notebooks. - Test module. |
| Solution Architect | Designs end-to-end data protection strategies across systems and teams. | - Evaluate platform fit. - Define architecture. - Guide implementation. | - Review sample apps. - Test modular deployment. - Assess performance. |
| Security or Privacy Lead | Ensures data protection aligns with compliance and governance requirements. | - Understand protection methods. - Validate policy behavior. - Review audit paths. | - Inspect logs. - Simulate policy scenarios. - Review discovery results. |
Use Cases
A range of use cases is supported across data protection, security, and emerging GenAI-driven applications.
Data Protection and Security Use Cases
These use cases focus on helping developers and data scientists secure sensitive data in conventional applications, services, and pipelines.
| Use Case | Description |
|---|---|
| Find and Redact | Discover sensitive data using the Data Discovery API and redact or mask it. |
| Find and Protect | Discover sensitive data using the Data Discovery API and protect it using tokenization. |
| Sample App Prototyping | Use prebuilt apps to simulate real-world scenarios, such as protecting PII in unstructured text. Helps accelerate evaluation and integration. |
| Python Module Integration | Integrate protection APIs into Python using lightweight modules. Useful for embedding Protegrity into existing development pipelines. |
| API Evaluation | Directly test protection and discovery APIs using tools like Postman or curl. Enables low-friction exploration of Protegrity’s core capabilities. |
GenAI Use Cases
AI Developer Edition supports emerging GenAI workflows where sensitive data may be used in prompts, training datasets, or inference pipelines. These use cases help developers and data scientists ensure privacy and compliance when working with large language models (LLMs) and AI-driven applications.
The Semantic Guardrail feature and samples are provided with AI Developer Edition. The use cases listed here are potential applications that users can develop using the feature.
| Use Case | Description |
|---|---|
| Chatbot Input Protection | Protect sensitive user inputs, such as names, emails, IDs, before passing them to GenAI models. Ensures privacy compliance in conversational AI workflows. |
| Prompt Sanitization | Automatically detect and mask PII in prompts used for LLM-based applications. Helps reduce risk in prompt engineering and inference. |
| Training Data Anonymization | Discover and redact sensitive fields in datasets used to train GenAI models. Supports responsible AI development practices. |
| Notebook-Based Experimentation | Use Jupyter notebooks to test protection and discovery workflows in GenAI pipelines. Ideal for data scientists working with unstructured or semi-structured data. |
These use cases are especially relevant for teams building AI-powered tools that interact with real-world user data, where privacy and data protection are critical.
2 - AI Developer Edition Architecture
A high-level architecture of AI Developer Edition is provided in the following image.

This release of AI Developer Edition consists of sample applications that use and showcase the capabilities of Data Discovery, Semantic Guardrail, protection, and unprotection through simple Python modules. The Data Discovery component is used for identifying sensitive data. After identification, the Python module redacts, masks, or protects the sensitive information. Protection is done using the AI Developer Edition API Service.
Data Discovery
Data Discovery is a developer-friendly product designed to identify sensitive data in plain text and free-text inputs.
For more information, refer to the Data Discovery documentation.
Overview
The Data Discovery Text Classification service advances data discovery and classification. It specializes in detecting Personally Identifiable Information (PII), Protected Health Information (PHI), and Payment Card Information (PCI) within plain text and free-text inputs. Unlike traditional structured data tools, it excels in dynamic, unstructured environments such as chatbot conversations, call transcripts, and Generative AI (GenAI) outputs.
Architecture
Data Discovery consists of three containers that are hosted on Docker: the Classification container, the Presidio provider container, and the RoBERTa provider container. The general architecture is illustrated in the following figure.

| Step | Description |
|---|---|
| 1 | The user sends the text to be classified as the body of a request to the Classification service. |
| 2 | The Classification service distributes the request to the Presidio and RoBERTa service providers to process the data. |
| 3 | The Presidio and RoBERTa providers process the data based on their logic and return their classifications to the Classification service. |
| 4 | The Classification service aggregates the responses from the service providers and sends the result to the user. |
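The request flow in the table can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in chosen for illustration: the two provider functions, the `Detection` record, and the rule that the highest score wins for an identical span. It does not reproduce the Presidio or RoBERTa providers or the aggregation logic of the Classification service.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Detection:
    start: int      # offset of the first character of the match
    end: int        # offset one past the last character
    label: str      # hypothetical entity label, for example "EMAIL"
    score: float    # confidence between 0 and 1
    provider: str   # which stand-in provider reported the match


def pattern_provider(text):
    """Stand-in for a rule-based provider: finds e-mail-like strings."""
    return [
        Detection(m.start(), m.end(), "EMAIL", 0.95, "pattern")
        for m in re.finditer(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
    ]


def model_provider(text):
    """Stand-in for a model-based provider: flags capitalized word pairs."""
    return [
        Detection(m.start(), m.end(), "PERSON", 0.60, "model")
        for m in re.finditer(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text)
    ]


def classify(text, providers):
    """Send the text to every provider, then merge the responses.

    Assumed merge rule: for an identical (start, end, label) span,
    keep the detection with the highest score.
    """
    best = {}
    for provider in providers:
        for det in provider(text):
            key = (det.start, det.end, det.label)
            if key not in best or det.score > best[key].score:
                best[key] = det
    return sorted(best.values(), key=lambda d: d.start)


sample = "Please contact Jane Doe at jane.doe@example.com for details."
results = classify(sample, [pattern_provider, model_provider])
for det in results:
    print(det.label, sample[det.start:det.end], det.score, det.provider)
```

Keeping each provider behind the same call signature is what lets the aggregator treat them uniformly; a real deployment would replace the stand-ins with calls to the running containers.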
Semantic Guardrail
Protegrity’s GenAI Security - Semantic Guardrail solution is a security guardrail engine for AI systems. It evaluates risks in GenAI chatbots, workflows, and agents through advanced semantic analytics and intent classification to detect potentially malicious messages. PII detection can also be leveraged for comprehensive security coverage.
For more information, refer to the Semantic Guardrail documentation.
Overview
The current implementation is trained on synthetic customer-service AI chatbot datasets. The system performs best when analyzing conversations expected to match the training domain, that is, English-language based customer service interactions involving orders, tickets, and purchases.
For domain-specific and user-specific applications requiring high detection accuracy, fine-tuning is necessary to completely leverage the model’s ability. This helps the model to learn from expected conversation patterns and message structures in both the inputs and outputs of protected GenAI systems.
The system operates by analyzing conversations between participants. These participants are users and AI systems, such as LLMs, agents, or contextual information sources. Furthermore, if Protegrity’s Data Discovery is present in the same network environment, the system uses its PII detection in its internal decision algorithm.
The solution provides individual message risk scores and classifications, and cumulative conversation risk scores and classifications. This dual-scoring approach ensures that while individual messages may appear benign, potentially risky cumulative conversation patterns are identified. This significantly enhances detection of sophisticated attack vectors, including LLM jailbreaks and prompt injection attempts.
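To see why a cumulative score can catch what per-message scores miss, consider the following minimal sketch. The combination rule (a noisy-OR over message scores) and the threshold value are assumptions made purely for illustration; the source does not describe how Semantic Guardrail computes its scores.

```python
def conversation_risk(message_scores):
    """Combine per-message risk scores into one conversation score.

    Assumed rule (noisy-OR): the conversation counts as safe only if
    every message is safe, so risk = 1 - product(1 - score).
    """
    safe_probability = 1.0
    for score in message_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be between 0 and 1")
        safe_probability *= 1.0 - score
    return 1.0 - safe_probability


THRESHOLD = 0.5  # hypothetical cut-off for flagging

scores = [0.2, 0.25, 0.3]  # three messages, each below the threshold
flagged_messages = [s for s in scores if s >= THRESHOLD]
total = conversation_risk(scores)

print("messages flagged:", len(flagged_messages))
print("conversation flagged:", total >= THRESHOLD)
```

In this example no single message crosses the threshold, yet the combined score does, which mirrors the dual-scoring idea described above.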
Architecture
The diagram shows how client applications integrate with Semantic Guardrail, and how Data Discovery PII can be integrated as a PII detector provider.

| Component | Description |
|---|---|
| External AI System | AI system, such as AI chatbot or Agent, that responds to a user, using LLM and data, which is integrated with the Semantic Guardrail solution. |
| External LLM | LLM employed as reasoning engine by the external AI system. |
| External Data Sources | Data sources used by an external AI system. |
| Semantic Guardrail | The core application operates as a containerized Docker service. It processes conversation data through HTTP requests and performs comprehensive security risk analysis, applying guardrails including Semantic Guardrail. |
| Data Discovery | For PII detection capabilities, Semantic Guardrail can leverage Protegrity’s Data Discovery solution. This solution operates as specialized Docker containers within the same environment. |
AI Developer Edition API Service
Protegrity AI Developer Edition API Service offers functionality derived from the original suite of Protegrity products in the form of API calls. The API endpoints are easy to use and require minimal configuration. Registration is required to send API requests to the service for protecting and unprotecting data. A set of predefined users and roles is provided. Based on the role used, different scenarios can be tried and tested.
Sample Applications
Protegrity AI Developer Edition provides Python modules that showcase the features of Protegrity products.
sample-app-find module
The sample-app-find module is a Python library that processes and identifies sensitive data.
The module can be customized to do the following functions:
- Specify a file name and output location for the source data. Only raw file formats are supported for Data Discovery. Multipart formats are not supported; only binary files are accepted.
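The difference between a raw body and a multipart form can be shown by building a request without sending it. This is a sketch under stated assumptions: the URL is a hypothetical placeholder, and the `text/plain` content type is a guess rather than a documented requirement. Refer to the Data Discovery documentation for the actual endpoint and headers.

```python
import urllib.request

# Hypothetical endpoint; replace it with the URL from the Data Discovery documentation.
CLASSIFY_URL = "http://localhost:8080/classify"


def build_classify_request(payload, url=CLASSIFY_URL):
    """Build a POST request whose body is the raw bytes of the source text."""
    if not isinstance(payload, (bytes, bytearray)):
        raise TypeError("payload must be raw bytes, not a multipart form")
    return urllib.request.Request(
        url,
        data=bytes(payload),
        headers={"Content-Type": "text/plain"},  # assumed content type
        method="POST",
    )


request = build_classify_request("Jane Doe lives in Paris.".encode("utf-8"))
print(request.get_method(), request.full_url, len(request.data), "bytes")
# Sending is left out on purpose. With the containers running, it could look like:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```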
sample-app-find-and-redact module
The sample-app-find-and-redact module is a Python library that processes the identified data and redacts or masks the information.
The module can be customized to do the following functions:
- Specify the items that must be identified.
- Specify the operation to be performed on the data, which is redact or mask.
- Specify a file name and output location for the source data. Only raw file formats are supported for Data Discovery. Multipart formats are not supported; only binary files are accepted.
- Specify a file name and output location for the transformed data.
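The two operations in the list above can be contrasted with a short sketch. The span format `(start, end, label)`, the `[LABEL]` replacement for redaction, and the `#` mask character are all assumptions for illustration; the behaviour of the actual module may differ.

```python
def transform(text, spans, operation="redact", mask_char="#"):
    """Apply redact or mask to (start, end, label) spans.

    Assumed behaviour: "redact" replaces the span with "[LABEL]" and
    "mask" overwrites each character with mask_char.
    """
    if operation not in ("redact", "mask"):
        raise ValueError("operation must be 'redact' or 'mask'")
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, label in sorted(spans, reverse=True):
        if operation == "redact":
            replacement = f"[{label}]"
        else:
            replacement = mask_char * (end - start)
        text = text[:start] + replacement + text[end:]
    return text


source = "Call Jane Doe on 555-0100."
spans = [(5, 13, "PERSON"), (17, 25, "PHONE")]
print(transform(source, spans, "redact"))  # Call [PERSON] on [PHONE].
print(transform(source, spans, "mask"))    # Call ######## on ########.
```

Masking preserves the length of the original text, while redaction keeps the entity type visible; which one fits depends on what the downstream consumer needs.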
sample-guardrail-python module
The sample-guardrail-python module is a Python library that submits a request to Semantic Guardrail for analysis.
The module can be customized to do the following functions:
- Specify the data that must be processed.
- Specify the operation that must be performed, that is, semantic processor for messages and pii processor for AI.
sample-app-find-and-protect module
The sample-app-find-and-protect module is a Python library that processes the identified data and protects the information. Calls are made to the AI Developer Edition API Service for performing tokenization.
The module can be customized to do the following functions:
- Specify the items that must be identified.
- Specify a file name and output location for the source data. Only raw file formats are supported for Data Discovery. Multipart formats are not supported; only binary files are accepted.
- Specify a file name and output location for the transformed data.
sample-app-find-and-unprotect module
The sample-app-find-and-unprotect module is a Python library that unprotects the information protected by the sample-app-find-and-protect module. Calls are made to the AI Developer Edition API Service for performing detokenization.
The module can be customized to do the following functions:
- Specify a file name and output location for the source data. Only data protected by the sample-app-find-and-protect module can be unprotected.
- Specify a file name and output location for the transformed data.
sample-app-protection module
The sample-app-protection module is a Python library that protects and unprotects data. Calls are made to the AI Developer Edition API Service for performing tokenization. The Data Discovery and Semantic Guardrail containers are not required to be running for the sample-app-protection module.
The module can be customized to do the following functions:
- Specify the items that must be protected, data element name, and user.
- Specify the operation that must be performed, that is, protect or unprotect.
3 - Setting up AI Developer Edition
Complete the prerequisites, optionally register for access to AI Developer Edition API Service, set up, verify, and run the required files for using Protegrity AI Developer Edition.
3.1 - Prerequisites
Hardware requirements
For the local Docker deployment mode, a machine with the following specifications will enable you to experiment with the main features:
- RAM: 16 GB
- CPU: 8 core
- Hard Disk: 30 GB available
For the local Docker deployment mode on macOS, a machine with the following specifications will enable you to experiment with the main features:
- RAM: 16 GB
- CPU: 4 core
- Hard Disk: 30 GB available
Software requirements
- Python v3.12.11 or later is installed. For more information about installing Python, refer to the Python website. Ensure that the python command points to a supported python3 version, for example, Python 3.12.11. Verify using the python --version command.
- pip for installing packages.
- Python Virtual Environment.
- Docker CLI is installed to manage Docker containers.
- Docker Compose is installed for local containerized deployments. This application supports Docker Compose V2. Ensure that your installation supports this version.
- Git is installed for cloning the repository.
- On macOS, Docker Desktop or Colima is installed in place of the Docker CLI. The remaining software requirements are the same as those listed above.
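The software requirements above can be checked with a short script before continuing. This is a minimal sketch: it compares the running interpreter against the 3.12.11 minimum and looks for the `docker` and `git` executables on the PATH. It does not verify the Docker Compose version or the presence of a virtual environment.

```python
import shutil
import sys

MIN_PYTHON = (3, 12, 11)
REQUIRED_TOOLS = ("docker", "git")


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True when the interpreter version meets the minimum."""
    version = tuple(version or sys.version_info[:3])
    return version >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


print("Python version OK:", python_ok())
print("Missing tools:", missing_tools() or "none")
```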
Additional settings for macOS
macOS requires additional steps for Docker and for systems with Apple Silicon chips. Complete the following steps before using AI Developer Edition.
Complete one of the following options to apply the settings.
- For Colima:
  - Open a command prompt.
  - Run the following command.
    colima start --vm-type vz --vz-rosetta --memory 4
- For Docker Desktop:
  - Open Docker Desktop.
  - Go to Settings > General.
  - Enable the following check boxes:
    - Use Virtualization framework
    - Use Rosetta for x86_64/amd64 emulation on Apple Silicon
  - Click Apply & restart.
Complete one of the following options to resolve certificate-related errors.
- For Colima:
  - Open a command prompt.
  - Open the following file.
    ~/.colima/default/colima.yaml
  - Update the following configuration in colima.yaml to add the path for obtaining the required images.
    Before update:
    docker: {}
    After update:
    docker:
      insecure-registries:
        - ghcr.io
  - Save and close the file.
  - Stop Colima.
    colima stop
  - Close and reopen the command prompt.
  - Start Colima.
    colima start --vm-type vz --vz-rosetta --memory 4
- For Docker Desktop:
  - Open Docker Desktop.
  - Click the gear or settings icon.
  - Click Docker Engine from the sidebar. The editor opens the current Docker daemon configuration, daemon.json.
  - Locate and add the insecure-registries key in the root JSON object. Ensure that a comma is added after the last value in the existing configuration.
    After update:
    { . . <existing configuration>, "insecure-registries": [ "ghcr.io", "githubusercontent.com" ] }
  - Click Apply & Restart to save the changes and restart Docker Desktop.
  - Verify: After Docker restarts, run docker info in your terminal and confirm that the required registry is listed under Insecure Registries.
Optional: Complete the following steps if this error is displayed: The requested image’s platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested.
- Open a command prompt.
- Open the following file.
  ~/.docker/config.json
- Add the following parameter.
  "default-platform": "linux/amd64"
- Save and close the file.
- Run docker compose up -d from the protegrity-developer-edition directory if it is already cloned; otherwise, continue with the setup.
3.2 - Optional - Obtaining access to the AI Developer Edition API Service
Registration is required only for running the APIs that protect, unprotect, and reprotect data. The find and redact features, which use Data Discovery and Semantic Guardrail, can be used without registration. Skip this section if the find and protect feature, which uses tokenization and encryption, is not required.
Registering for access
Sign up for access to the AI Developer Edition API Service. This is required for obtaining access to use the APIs.
- Open a web browser.
- Navigate to https://www.protegrity.com/developers/get-api-credentials.
- Specify the following details:
- First Name
- Last Name
- Work Email
- Job Title
- Company Name
- Country
- Click the Terms & Conditions link and read the terms and conditions.
- Select the check box to accept the terms and conditions.
- Click Get Started.
The request is analyzed. After the request is approved, a password and an API key to access the AI Developer Edition API Service are sent to the Work Email specified. If the account already exists, then the details are re-sent to the email address. The email takes a minute or two to arrive. If you do not see the email in your inbox, check your spam or junk folder before retrying.
Specifying the authentication information
Add the login information provided by Protegrity to the environment to access the AI Developer Edition API Service.
It is recommended to add the details to the environment variables to avoid specifying the information every time the environment is initialized.
- Open a command prompt.
- Initialize a Python virtual environment.
- Add the email address of the user.
  For bash or a similar shell:
  export DEV_EDITION_EMAIL='<Email_used_for_registration>'
  For PowerShell:
  $env:DEV_EDITION_EMAIL = '<Email_used_for_registration>'
- Specify the password provided in the registration email.
  For bash or a similar shell:
  export DEV_EDITION_PASSWORD='<Password_provided_in_email>'
  For PowerShell:
  $env:DEV_EDITION_PASSWORD = '<Password_provided_in_email>'
- Specify the API key for accessing the AI Developer Edition API Service.
  For bash or a similar shell:
  export DEV_EDITION_API_KEY='<API_key_provided_in_email>'
  For PowerShell:
  $env:DEV_EDITION_API_KEY = '<API_key_provided_in_email>'
- Verify that the variables are set.
  For bash or a similar shell:
  test -n "$DEV_EDITION_EMAIL" && echo "EMAIL $DEV_EDITION_EMAIL set" || echo "EMAIL missing"
  test -n "$DEV_EDITION_PASSWORD" && echo "PASSWORD $DEV_EDITION_PASSWORD set" || echo "PASSWORD missing"
  test -n "$DEV_EDITION_API_KEY" && echo "API KEY $DEV_EDITION_API_KEY set" || echo "API KEY missing"
  For PowerShell:
  if ($env:DEV_EDITION_EMAIL) { Write-Output "EMAIL $env:DEV_EDITION_EMAIL set" } else { Write-Output "EMAIL missing" }
  if ($env:DEV_EDITION_PASSWORD) { Write-Output "PASSWORD $env:DEV_EDITION_PASSWORD set" } else { Write-Output "PASSWORD missing" }
  if ($env:DEV_EDITION_API_KEY) { Write-Output "API KEY $env:DEV_EDITION_API_KEY set" } else { Write-Output "API KEY missing" }
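The same check can be made from Python, which is where the sample applications run. The following sketch reads the three variable names used in this section and reports only the names that are missing, so the password and API key values are not written to the screen.

```python
import os

REQUIRED_VARS = (
    "DEV_EDITION_EMAIL",
    "DEV_EDITION_PASSWORD",
    "DEV_EDITION_API_KEY",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All credentials are set.")
```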
AI Developer Edition API Service usage guidelines
To ensure fair use of the API service, rate limits are enforced on API requests to the AI Developer Edition API Service.
These limits are:
- Request rate: 50 per second
- Burst: up to 100
- Quota: 10,000 requests per user per day
- Maximum payload size: 1MB
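A client can stay within these limits by throttling itself before sending. The following sketch uses a token bucket sized to the published request rate and burst, plus a payload size check. It is a client-side illustration only: how the service measures the limits is not described here, the daily quota is not modelled, and treating 1MB as 1,000,000 bytes is an assumption.

```python
import time

MAX_PAYLOAD_BYTES = 1_000_000  # assumes "1MB" means 10**6 bytes


class TokenBucket:
    """Client-side limiter: `rate` tokens per second, up to `burst` stored."""

    def __init__(self, rate=50, burst=100, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; False means the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def payload_allowed(payload):
    """Check the payload size before sending it."""
    return len(payload) <= MAX_PAYLOAD_BYTES


bucket = TokenBucket()
sent = sum(1 for _ in range(150) if bucket.try_acquire())
print("requests allowed immediately:", sent)
```

The `clock` parameter exists so the limiter can be exercised with a fake clock in tests instead of waiting in real time.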
3.3 - Setting up the packages
Obtaining the package
- Navigate to the Protegrity AI Developer Edition repository.
- Clone or download the repositories on your local system.
  - protegrity-developer-edition: Contains the files to launch the required containers. It also contains the sample applications and files.
    git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition.git
  - To customize the Python modules, clone and use the source from the protegrity-developer-python repository.
    git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-python.git
- Verify the files in the package. The list of files in the git package can be obtained from the files list.
If the repositories were cloned earlier, complete the following steps to update them.
- Back up the Protegrity AI Developer Edition repository if the Python and configuration files have been updated.
- Navigate to the cloned repository location for protegrity-developer-edition.
- Run the following command to stop the containers.
  docker compose down
  Based on your configuration, use the docker-compose down command.
- Sync to update the repositories on the local system using the git pull command.
  - protegrity-developer-edition: Contains the files to launch the required containers. It also contains the sample applications and files.
  - protegrity-developer-python: Contains the source files for customizing and using the Python module.
- Verify the files in the package. The list of files in the git package can be obtained from the files list.
Setting up Data Discovery and Semantic Guardrail
The containers contain the Data Discovery and Semantic Guardrail components required for identifying sensitive data.
- Open a command prompt.
- Navigate to the cloned repository location for protegrity-developer-edition.
- Run the following command to download and start the containers. The dependent containers are large in size. Based on the network connection, the containers might take time to download and deploy.
  docker compose up -d
  Based on your configuration, use the docker-compose up -d command.
- Verify that the containers started successfully.
  docker compose logs
If the containers were set up earlier, complete the following steps instead.
- Open a command prompt.
- Navigate to the cloned repository location for protegrity-developer-edition.
- If the step to stop containers was missed earlier, then use the following commands to identify and remove the AI Developer Edition containers.
  docker compose down
  docker compose down --remove-orphans
- Delete the docker network resources.
  docker network rm -f <network_name_or_id>
  For example,
  docker network rm -f protegrity-network
- Run the following command to download and start the containers. The dependent containers are large in size. Based on the network connection, the containers might take time to download and deploy.
  docker compose up -d
  Based on your configuration, use the docker-compose up -d command.
- Verify that the containers started successfully.
  docker compose logs
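Besides reading the logs, the container state can be checked from a script. The sketch below parses the output of `docker compose ps --format json` and lists services that are not running. Two points are assumptions: the `Service` and `State` field names reflect what Compose V2 typically emits and may vary by release, and the service names in the sample string are invented. The helper `check_containers` is defined but not called, because it needs Docker and must be run from the protegrity-developer-edition directory.

```python
import json
import subprocess


def parse_compose_ps(output):
    """Parse `docker compose ps --format json` output into a list of dicts.

    Handles both a single JSON array and one JSON object per line,
    since the layout differs between Compose releases.
    """
    output = output.strip()
    if not output:
        return []
    if output.startswith("["):
        return json.loads(output)
    return [json.loads(line) for line in output.splitlines() if line.strip()]


def not_running(containers):
    """Return the service names whose state is not "running"."""
    return [
        c.get("Service", c.get("Name", "?"))
        for c in containers
        if c.get("State") != "running"
    ]


def check_containers():
    """Run the Compose command in the current directory and report problems."""
    result = subprocess.run(
        ["docker", "compose", "ps", "--all", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return not_running(parse_compose_ps(result.stdout))


# Invented sample output with hypothetical service names.
sample_output = (
    '{"Service": "classification", "State": "running"}\n'
    '{"Service": "provider", "State": "exited"}'
)
print("Not running:", not_running(parse_compose_ps(sample_output)))
```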
Installing the protegrity-developer-python Module
The module has built-in functions to find, redact, mask, and protect data.
- Open a command prompt.
- Install the protegrity-developer-python module. It is recommended to install and activate the Python virtual environment before running this command.
  pip install protegrity-developer-python
  The installation completes and the success message is displayed.
To compile and install the Python module from source, refer to Building the Python module.
To upgrade an existing installation, complete the following steps.
- Open a command prompt.
- Upgrade the protegrity-developer-python module. It is recommended to install and activate the Python virtual environment before running the command.
  pip install --upgrade protegrity-developer-python
  The package is successfully upgraded.
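To confirm which version of the module is active in the current environment, the standard library can query the installed package metadata. This sketch uses the distribution name from the pip commands above and returns nothing if the package is not installed in the active environment.

```python
from importlib import metadata


def installed_version(package="protegrity-developer-python"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("Installed version:", version or "not installed")
```

Running this inside the activated virtual environment is a quick way to catch the common mistake of installing into one environment and running the samples from another.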
3.4 - Verifying the files in the Protegrity AI Developer Edition package
protegrity-developer-edition repository
The repository for obtaining and running the sample application.
- docker-compose.yml: This file contains the configuration for deploying the Data Discovery and Semantic Guardrail containers.
- README.md: The readme file specifying the steps to set up the product.
- samples: The directory with the sample application and scripts for the Python module.
  - sample-app-find-and-redact.py: The sample application Python file for detecting and redacting sensitive information in the source file.
  - sample-app-find-and-protect.py: The sample application Python file for detecting and protecting sensitive information in the source file using tokenization and encryption.
  - sample-app-find-and-unprotect.py: The sample application Python file for unprotecting sensitive information in the source file. The source is generated by sample-app-find-and-protect.py.
  - sample-app-protection.py: The sample application Python file for protecting and unprotecting data.
  - sample-app-find.py: The sample application Python file for detecting and listing sensitive information in the source file.
  - config.json: The configuration file for the Python application.
  - sample-data: The directory with the sample file.
    - input.txt: The sample file that is processed.
    - output-redact.txt: The output file created by the find and redact application.
    - output-protect.txt: The output file created by the find and protect application.
- data-discovery: The directory with the sample application and scripts for Data Discovery.
  - sample-classification-commands.sh: A file with the sample curl command for identifying sensitive data.
  - sample-classification-python.py: A sample Python module for identifying sensitive data.
- semantic-guardrail: The directory with the sample application and scripts for Semantic Guardrail.
  - sample-guardrail-python.py: A sample Python module for submitting a multi-turn conversation with semantic and PII processors.
protegrity-developer-python repository
The repository with the source files for customizing and compiling the Python module.
- LICENSE: The license file with the terms and conditions for using the application.
- README.md: The readme file for working with the Python module.
- pyproject.toml: The configuration file for the script.
- pytest.ini: The configuration file for the Pytest framework.
- requirements.txt: The configuration file for the script.
- appython: The directory for the source file.
  - __init__.py: The initializing script.
  - protector.py: The source file for the script.
- protegrity-developer-python: The directory for the source file.
  - __init__.py: The initializing script.
  - securefind.py: The source file for the script.
4 - Running the sample application
In AI Developer Edition, a user uploads a file using the sample application, and the Data Discovery containers process the file to detect sensitive data. A Python module then redacts, masks, or protects and unprotects the data. The sanitized file is saved to a configured location. For more information about the sample application, refer to Sample application.
Use the steps provided here to run the application end-to-end. If required, run the APIs and functions provided for performing specific tasks. For more information about the identification APIs, refer to Data Discovery API.
Running the applications
Applications are provided out-of-the-box to test and understand the capabilities of AI Developer Edition.
Running the sample find application
- Open a command prompt.
- Navigate to the directory where AI Developer Edition is cloned.
- Run the sample application using the following command.
  python samples/sample-app-find.py
- View the output of the files processed on the screen. The output displays a list of sensitive items in the source file.

- View the processed output file in the output directory.
Running the sample find and redact application
Open a command prompt.
Navigate to the directory where AI Developer Edition is cloned.
Run the sample application using the following command.
python samples/sample-app-find-and-redact.py
View the output of the files processed on the screen. The output displays a list of sensitive items in the source file. It also displays the location and name of the output file with the redacted output.
View the processed output file in the output directory.
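As a rough illustration of what a find-and-redact step does, the following sketch replaces each detected span with its entity label. It assumes a classifications dictionary shaped like the response of the Data Discovery classify API described later in this document, with end_index treated as exclusive. The redact function and its mask format are hypothetical and are not part of the sample application.

```python
def redact(text, classifications, min_score=0.0, mask="[{label}]"):
    """Replace every detected span with a placeholder built from its entity label."""
    spans = []
    for label, findings in classifications.items():
        for finding in findings:
            if finding["score"] >= min_score:
                location = finding["location"]
                spans.append((location["start_index"], location["end_index"], label))
    # Replace from the end of the text so that earlier indices remain valid.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + mask.format(label=label) + text[end:]
    return text


# Trimmed-down findings based on the sample classify response in this document.
sample_text = "You can reach Dave Elliot by phone 203-555-1286."
sample_classifications = {
    "PERSON": [{"score": 0.92, "location": {"start_index": 14, "end_index": 25}}],
    "PHONE_NUMBER": [{"score": 0.87, "location": {"start_index": 35, "end_index": 47}}],
}

redacted = redact(sample_text, sample_classifications)
print(redacted)
```

The sketch does not handle overlapping spans. Raising min_score keeps low-confidence findings in the clear.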
Running Semantic Guardrail
- Open a command prompt.
- Navigate to the directory where AI Developer Edition is cloned.
- Run the following command to test Semantic Guardrail. The command submits a multi-turn conversation for analysis by two processors: one for semantic processing and one for PII processing.
python semantic-guardrail/sample-guardrail-python.py

Running the sample find and protect application for Python
Ensure that the required credentials are obtained and environment variables specified, using the steps from Optional - Obtaining access to the AI Developer Edition API Service.
Open a command prompt.
Navigate to the directory where AI Developer Edition is cloned.
Run the sample application using the following command.
python samples/sample-app-find-and-protect.py
View the output of the files processed on the screen. The output displays the protected data and unprotected data.

View the processed output file in the output directory. The samples/sample-data/output-protect.txt file is generated with the protected, that is, tokenized-like, values.
To obtain the original data, run the following command.
python samples/sample-app-find-and-unprotect.py
This reads the samples/sample-data/output-protect.txt file and produces the samples/sample-data/output-unprotect.txt file with the original values.

Running the sample find and protect application for Java
Ensure that the required credentials are obtained and environment variables specified, using the steps from Optional - Obtaining access to the AI Developer Edition API Service.
Open a command prompt.
Navigate to the directory where AI Developer Edition is cloned.
Run the sample application using the following command.
java samples/sample-app-find-and-protect.py
View the output of the files processed on the screen. The output displays the protected data and unprotected data.

View the processed output file in the output directory. The samples/sample-data/output-protect.txt file is generated with the protected, that is, tokenized-like, values.
To obtain the original data, run the following command.
java samples/sample-app-find-and-unprotect.py
This reads the samples/sample-data/output-protect.txt file and produces the samples/sample-data/output-unprotect.txt file with the original values.

Running the script for protecting data
The sample-app-protection.py script showcases various scenarios for protecting, unprotecting, and reprotecting data.
Understanding Users and Roles
The users and roles are built in for impersonation testing. Use any of the preconfigured users to showcase Protegrity’s Role-Based Access Controls. Different users get distinct views of the sensitive data. Some users can only protect data and cannot reverse the operation. Some users can only re-identify selected attributes.
To use any of the roles, pass the chosen value in the user attribute of the payload during the protect or unprotect operation. If the user is not specified, the request defaults to superuser.
The following roles and users have been configured and are available for use:
| Role | User | Description |
|---|---|---|
| ADMIN | admin, devops, jay.banerjee | The role can protect all data but cannot unprotect it. On an unprotect attempt, the protected values are displayed. |
| FINANCE | finance, robin.goodwill | The role can unprotect all PII and PCI data. The role cannot protect any data. On an attempt to unprotect data without authorization, nulls are displayed. |
| MARKETING | marketing, merlin.ishida | The role can unprotect some PII data that is required for analytical research and campaign outreach. The role cannot protect any data. On an attempt to unprotect data without authorization, nulls are displayed. |
| HR | hr, paloma.torres | The role can unprotect all PII data but cannot view any PCI data. The role cannot protect any data. On an attempt to unprotect data without authorization, nulls are displayed. |
| OTHER | superuser | The role can perform any protect and unprotect operation. The role has been made available for testing only – we strongly advise against creating superuser roles in your environments. |
Additionally, you may type in any user name to simulate unauthorized user behavior.
Understanding the Data Elements
The supported data elements are listed here. For a mapping of the Data Element and the Entity Type, refer to Supported Sensitive Entity Types.
For more information about the data elements policy, refer to Data Security Policy.
| Name | Description |
|---|---|
| name | Protect or unprotect name of a person |
| name_de | Protect or unprotect name of a person in the German language |
| name_fr | Protect or unprotect name of a person in the French language |
| address | Protect or unprotect an address |
| address_de | Protect or unprotect an address in the German language |
| address_fr | Protect or unprotect an address in the French language |
| city | Protect or unprotect a town or city |
| city_de | Protect or unprotect a town or city name in the German language |
| city_fr | Protect or unprotect a town or city name in the French language |
| postcode | Protect or unprotect a postal code with digits and characters |
| zipcode | Protect or unprotect a postal code with digits only |
| phone | Protect or unprotect a phone number |
| email | Protect or unprotect an email |
| datetime | Protect or unprotect all components of a datetime string: date, month, year, and time |
| datetime_yc | Protect or unprotect a datetime string. Year will be in clear. |
| int | Protect or unprotect a 4-byte integer string |
| nin | Protect or unprotect a National Insurance Number UK |
| ssn | Protect or unprotect a Social Security Number US |
| ccn | Protect or unprotect a Credit Card Number |
| ccn_bin | Protect or unprotect a Credit Card Number. Leaves 8-digit BIN in the clear. |
| passport | Protect or unprotect a passport number |
| iban | Protect or unprotect an International Banking Account Number |
| iban_cc | Protect or unprotect an International Banking Account Number. Leaves letters in the clear. |
| string | Protect or unprotect a string |
| number | Protect or unprotect a number |
| text | Protect or unprotect text using encryption |
Testing the sample file
Ensure that the required credentials are obtained and environment variables specified, using the steps from Optional - Obtaining access to the AI Developer Edition API Service.
Open a command prompt.
Navigate to the directory where AI Developer Edition is cloned.
Protect data using the following command.
python samples/sample-app-protection.py --input_data "John Smith" --policy_user superuser --data_element name --protect
View the protected output.

Unprotect the data obtained from the earlier step using the following command.
python samples/sample-app-protection.py --input_data "<protected_data>" --policy_user superuser --data_element name --unprotect
View the unprotected output.

Encrypt data using the following command.
python samples/sample-app-protection.py --input_data "John Smith" --policy_user superuser --data_element text --enc
View the encrypted output.

Decrypt the data obtained from the earlier step using the following command.
python samples/sample-app-protection.py --input_data "<encrypted_data>" --policy_user superuser --data_element text --dec
View the decrypted output.

Use the help command for more information about using the sample file.
python samples/sample-app-protection.py --help
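When scripting several user and data element combinations, it may help to build the command line programmatically. The following is a minimal sketch that only uses the flags shown in the commands above. The build_command helper is hypothetical and is not part of the package.

```python
import shlex

SCRIPT = "samples/sample-app-protection.py"
# Operation flags taken from the commands shown in this section.
OPERATIONS = {"protect", "unprotect", "enc", "dec"}


def build_command(input_data, policy_user, data_element, operation):
    """Return the argument list for one run of the sample protection script."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    return [
        "python", SCRIPT,
        "--input_data", input_data,
        "--policy_user", policy_user,
        "--data_element", data_element,
        f"--{operation}",
    ]


cmd = build_command("John Smith", "superuser", "name", "protect")
# shlex.join quotes arguments that contain spaces for display in a POSIX shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute the script without any shell quoting concerns.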
Additional use cases for user role behavior for data protection
This section demonstrates the expected behavior of various user roles when running the sample-app-protection.py. Each section describes the permissions and restrictions for a role, followed by example commands and their outputs.
ADMIN
Users: admin, devops, jay.banerjee
This role can protect all data but cannot unprotect. When attempting to unprotect, protected values are displayed.
python sample-app-protection.py --input_data "Protegrity$" --policy_user devops --data_element name --protect --protect
python sample-app-protection.py --input_data "2839874358655598" --policy_user admin --data_element ccn --protect --protect
python sample-app-protection.py --input_data "CxWHeztVNp$" --policy_user jay.banerjee --data_element name --protect --unprotect
python sample-app-protection.py --input_data "6211214171366290" --policy_user admin --data_element ccn --protect --unprotect

FINANCE
Users: finance, robin.goodwill
This role can unprotect all PII and PCI data. The role cannot protect any data. When attempting to unprotect data without authorization, the value Null is displayed.
python sample-app-protection.py --input_data "xzrT sqdVc" --policy_user finance --data_element name --unprotect
python sample-app-protection.py --input_data "4321567898765432" --policy_user finance --data_element ccn --unprotect
python sample-app-protection.py --input_data "John Smith" --policy_user finance --data_element name --protect
python sample-app-protection.py --input_data "2839874358655598" --policy_user robin.goodwill --data_element ccn --protect
python sample-app-protection.py --input_data "1998/10/11" --policy_user finance --data_element datetime --unprotect
python sample-app-protection.py --input_data "1998/10/11" --policy_user robin.goodwill --data_element datetime --unprotect

MARKETING
Users: marketing, merlin.ishida
This role can unprotect some PII data that is required for analytical research and campaign outreach. The role cannot protect any data. When attempting to unprotect data without authorization, the value Null is displayed.
python sample-app-protection.py --input_data "DnZQHKcpVJ, J.G." --policy_user marketing --data_element city --unprotect
python sample-app-protection.py --input_data "4321567898765432" --policy_user merlin.ishida --data_element ccn --unprotect
python sample-app-protection.py --input_data "Washington, D.C." --policy_user marketing --data_element city --protect
python sample-app-protection.py --input_data "2839874358655598" --policy_user merlin.ishida --data_element ccn --protect

HR
Users: hr, paloma.torres
This role can unprotect all PII data but cannot view any PCI data. The role cannot protect any data. When attempting to unprotect data without authorization, the value Null is displayed.
python sample-app-protection.py --input_data "2839874358655598" --policy_user paloma.torres --data_element ccn --unprotect
python sample-app-protection.py --input_data "CIF123654987" --policy_user hr --data_element passport --unprotect
python sample-app-protection.py --input_data "John Doe" --policy_user hr --data_element name --protect
python sample-app-protection.py --input_data "John Doe" --policy_user paloma.torres --data_element name --protect
python sample-app-protection.py --input_data "4321567898765432" --policy_user paloma.torres --data_element ccn --protect

OTHER
User: superuser
This role can perform any protect and unprotect operation. The role is only made available for testing. It is strongly advised against creating superuser roles in an environment.
python sample-app-protection.py --input_data "John Smith" --policy_user superuser --data_element name --protect --unprotect
python sample-app-protection.py --input_data "2839874358655598" --policy_user superuser --data_element ccn --protect --unprotect

4.1 - Data Discovery API
The Data Discovery service exposes its API on port 8580.
Data Discovery Classification Service
This API identifies, classifies, and locates sensitive data.
Endpoint
http://{Host Address}:8580/pty/data-discovery/v1.0/classify
Path
/pty/data-discovery/v1.0/classify
Method
POST
Parameters
Define the value in the score_threshold parameter to exclude results with a low score. This parameter is optional and accepts the following values:
Type: float
Values: minimum 0, maximum 1.0
Default: 0.00
For example, score_threshold = 0.75
Example Data
You can reach Dave Elliot by phone 203-555-1286.
The data must be UTF-8 encoded, and the input is limited to 10,000 characters.
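The constraints above can be checked on the client before a request is sent. The following is a minimal sketch that validates the score threshold range and the character limit, and then builds the request URL and UTF-8 body. The build_classify_request helper and the localhost host name are assumptions used for illustration only.

```python
from urllib.parse import urlencode

CLASSIFY_PATH = "/pty/data-discovery/v1.0/classify"
MAX_CHARS = 10_000  # documented limit on the input length


def build_classify_request(host, text, score_threshold=None, port=8580):
    """Validate the input and return the request URL and the UTF-8 encoded body."""
    if len(text) > MAX_CHARS:
        raise ValueError(f"input exceeds {MAX_CHARS} characters")
    url = f"http://{host}:{port}{CLASSIFY_PATH}"
    if score_threshold is not None:
        if not 0.0 <= score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1.0")
        url += "?" + urlencode({"score_threshold": score_threshold})
    return url, text.encode("utf-8")


url, body = build_classify_request(
    "localhost",  # assumed host; replace with the address of the service
    "You can reach Dave Elliot by phone 203-555-1286.",
    score_threshold=0.75,
)
print(url)
```

The returned URL and body can be sent with any HTTP client as a POST request with the Content-Type header set to text/plain.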
Sample Request
http://{Host Address}:8580/pty/data-discovery/v1.0/classify
Response Codes
Successful Response.
{
"providers": [
{
"name": "Presidio Classification Provider",
"version": "1.0.0",
"status": 200,
"elapsed_time": 1.014178991317749,
"exception": null,
"config_provider": {
"name": "Presidio",
"address": "http://presidio_provider_service",
"supported_content_types": []
}
},
{
"name": "Roberta Classification Provider",
"version": "1.0.0",
"status": 200,
"elapsed_time": 19.091534852981567,
"exception": null,
"config_provider": {
"name": "Roberta",
"address": "http://roberta_provider_service",
"supported_content_types": []
}
}
],
"classifications": {
"PERSON": [
{
"score": 0.9236000061035157,
"location": {
"start_index": 14,
"end_index": 25
},
"classifiers": [
{
"provider_index": 0,
"name": "SpacyRecognizer",
"score": 0.85,
"details": {}
},
{
"provider_index": 1,
"name": "roberta",
"score": 0.9972000122070312,
"details": {}
}
]
}
],
"PHONE_NUMBER": [
{
"score": 0.8746500015258789,
"location": {
"start_index": 35,
"end_index": 47
},
"classifiers": [
{
"provider_index": 0,
"name": "PhoneRecognizer",
"score": 0.75,
"details": {}
},
{
"provider_index": 1,
"name": "roberta",
"score": 0.9993000030517578,
"details": {}
}
]
}
]
}
}
Request must have a body, but no request body was provided.
Payload too large.
Unsupported media type.
Unexpected internal server error. Check server logs.
Internal server error. Check server logs.
Sample Request
curl -X POST "http://<SERVER_IP>:8580/pty/data-discovery/v1.0/classify?score_threshold=0.85" \
-H "Content-Type: text/plain" \
--data "You can reach Dave Elliot by phone 203-555-1286"

import requests
url = "http://<SERVER_IP>:8580/pty/data-discovery/v1.0/classify"
params = {"score_threshold": 0.85}
headers = {"Content-Type": "text/plain"}
data = "You can reach Dave Elliot by phone 203-555-1286"
response = requests.post(url, params=params, headers=headers, data=data, verify=False)
print("Status code:", response.status_code)
print("Response JSON:", response.json())

URL: POST `http://<SERVER_IP>:8580/pty/data-discovery/v1.0/classify`
Query Parameters:
- score_threshold (optional), float between 0.0 and 1.0, default: 0.
Headers:
- Content-Type: text/plain
Body:
- You can reach Dave Elliot by phone 203-555-1286
4.2 - Application Protector Python APIs
This section describes the APIs supported by AP Python, including their syntax and sample use cases.
Before running the APIs in this section, ensure that the required credentials are obtained and environment variables specified, using the steps from Optional - Obtaining access to the AI Developer Edition API Service.
Initialize the protector
The Protector API returns the Protector object associated with the AP Python APIs. After instantiation, this object is used to create a session. The session object provides APIs to perform the protect, unprotect, or reprotect operations.
Protector(self)
Note: Do not pass the self parameter while invoking the API.
Parameters
None
Returns
Protector: Object associated with the AP Python APIs.
Exceptions
InitializationError: This exception is thrown if the protector fails to initialize.
Example
In the following example, the AP Python is initialized.
from appython import Protector
protector = Protector()
create_session
The create_session API creates a new session. Sessions created using this API automatically time out after the session timeout value is reached. The default session timeout value is 15 minutes, but a different timeout value can be passed as a parameter to this API.
Note: If the session is invalid or has timed out, then the AP Python APIs that are invoked using this session object may throw an InvalidSessionError exception. Application developers can catch the InvalidSessionError exception and create a new session by invoking the create_session API again.
def create_session(self, policy_user, timeout=15)
Note: Do not pass the self parameter while invoking the API.
Parameters
policy_user: Username defined in the policy, as a string value.
timeout: Session timeout, specified in minutes. By default, the value of this parameter is set to 15. This parameter is optional.
Returns
session: Object of the Session class. A session object is required for calling the data protection operations, such as, protect, unprotect, and reprotect.
Exceptions
ProtectorError: This exception is thrown if a null or empty value is passed as the policy_user parameter.
Example
In the following example, superuser is passed as the policy_user parameter.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
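The note about InvalidSessionError suggests a retry pattern: catch the exception, create a new session, and repeat the operation. The following is a minimal sketch of that pattern. Because the import location of the exception is not stated here, the sketch defines a stand-in InvalidSessionError class and a stub session so that it runs on its own. The run_with_fresh_session helper is hypothetical.

```python
class InvalidSessionError(Exception):
    """Stand-in for the AP Python exception of the same name (assumption)."""


def run_with_fresh_session(create_session, operation, session=None, retries=1):
    """Run operation(session); on InvalidSessionError, create a new session and retry."""
    if session is None:
        session = create_session()
    for attempt in range(retries + 1):
        try:
            return operation(session), session
        except InvalidSessionError:
            if attempt == retries:
                raise
            session = create_session()


class StubSession:
    """Minimal fake session used only to demonstrate the retry flow."""

    def __init__(self, expired):
        self.expired = expired

    def protect(self, data, de):
        if self.expired:
            raise InvalidSessionError("session timed out")
        return f"<protected:{data}>"


# The first session is expired, so the helper creates a second one and retries.
sessions = iter([StubSession(expired=True), StubSession(expired=False)])
result, live_session = run_with_fresh_session(
    lambda: next(sessions),
    lambda s: s.protect("Protegrity1", "string"),
)
print(result)
```

With the real module, create_session would be a call such as lambda: protector.create_session("superuser"), and the returned session can be reused for later calls.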
get_version
The get_version API returns the version of the AP Python in use. Ensure that the version number of the AP Python matches with the AP Python build package.
Note: You do not need to create a session for invoking the get_version API.
def get_version(self)
Note: Do not pass the self parameter while invoking the API.
Parameters
None
Returns
String: Product version of the installed AP Python.
Exceptions
None
Example
In the following example, the current version of the installed AP Python is retrieved.
from appython import Protector
protector = Protector()
print(protector.get_version())
Result
1.0.0
protect
The protect API protects data using a tokenization, data type preserving encryption, No Encryption, or encryption data element. It supports both single and bulk protection without a maximum bulk size limit. However, it is recommended not to pass more than 1 MB of input data in each protection call.
For String and Byte data types, the maximum length for tokenization is 4096 bytes, while no maximum length is defined for encryption.
def protect(self, data, de, **kwargs)
Note: Do not pass the self parameter while invoking the API.
Parameters
data: Data to be protected. You can provide the data of any type that is supported by the AP Python. For example, you can specify data of type string, or integer. However, you cannot provide the data of multiple data types at the same time in a bulk call.
de: String containing the data element name defined in policy.
**kwargs: Specify one or more of the following keyword arguments:
- external_iv: Specify the external initialization vector for Tokenization. This argument is optional.
- encrypt_to: Specify this argument for encrypting the data and set its value to bytes. This argument is mandatory for encryption and must not be used for tokenization.
- charset: This is an optional argument. It indicates the byte order of the input buffer. You can specify a value for this argument from the charset constants, such as, UTF8, UTF16LE, or UTF16BE. The default value for the charset argument is UTF8.
The charset argument is only applicable for the input data of byte type.
The charset parameter is mandatory for the data elements created with Unicode Gen2 tokenization method for byte APIs. The encoding set for the charset parameter must match the encoding of the input data passed.
Note: Keyword arguments are case sensitive.
Returns
- For single data: Returns the protected data
- For bulk data: Returns a tuple of the following data:
- List or tuple of the protected data
- Tuple of error codes
Exceptions
InvalidSessionError: This exception is thrown if the session is invalid or has timed out.
ProtectError: This exception is thrown if the API is unable to protect the data.
If the protect API is used with bulk data, then it does not throw any exception. Instead, it only returns an error code.
For more information about the return codes, refer to Application Protector Return Codes.
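The input rules above (one data type per bulk call, 4096 bytes per tokenized string or byte value, and the recommended 1 MB per call) can be checked before calling the protect API. The following is a minimal sketch of such a pre-check. The check_bulk_input helper is hypothetical, it measures string sizes as UTF-8 bytes, and it interprets 1 MB as 1024 * 1024 bytes, both of which are assumptions.

```python
MAX_TOKEN_BYTES = 4096                 # tokenization limit for string and byte data
RECOMMENDED_CALL_BYTES = 1024 * 1024   # recommended ceiling per call (assumed 1 MiB)


def _size(item):
    """Approximate size of one item in bytes (strings measured as UTF-8)."""
    if isinstance(item, bytes):
        return len(item)
    return len(str(item).encode("utf-8"))


def check_bulk_input(data):
    """Return a list of problems found in the bulk input; an empty list means none."""
    problems = []
    type_names = sorted({type(item).__name__ for item in data})
    if len(type_names) > 1:
        problems.append("mixed data types: " + ", ".join(type_names))
    for index, item in enumerate(data):
        if isinstance(item, (str, bytes)) and _size(item) > MAX_TOKEN_BYTES:
            problems.append(f"item {index} exceeds {MAX_TOKEN_BYTES} bytes")
    if sum(_size(item) for item in data) > RECOMMENDED_CALL_BYTES:
        problems.append("total input exceeds the recommended 1 MB per call")
    return problems


print(check_bulk_input(["protegrity1234", "Protegrity1", "Protegrity56"]))
print(check_bulk_input(["Protegrity1", 21]))
```

The 4096-byte check applies to tokenization only, so it can be skipped when the data element performs encryption.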
Example - Tokenizing String Data
The examples for using the protect API for tokenizing the string data are described in this section.
Example 1: Input string data
In the following example, the Protegrity1 string is used as the data, which is tokenized using the string Alpha Numeric data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "string")
print("Protected Data: %s" %output)
Result
Protected Data: 4l0z9SQrhtk
Example 2: Input string data using session as Context Manager
In the following example, the Protegrity1 string is used as the data, which is tokenized using the string Alpha Numeric data element.
from appython import Protector
protector = Protector()
with protector.create_session("superuser") as session:
    output = session.protect("Protegrity1", "string")
    print("Protected Data: %s" %output)
Result
Protected Data: 4l0z9SQrhtk
Example 3: Input date passed as a string
In the following example, the 1998/05/29 string is used as the data, which is tokenized using the datetime Date data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("1998/05/29", "datetime")
print("Protected data: "+str(output))
Result
Protected data: 0634/01/28
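Because the input date format must match the format of the data element, it can be useful to validate the string before protecting it. The following is a minimal sketch that checks for the YYYY/MM/DD format used in this example. The is_yyyy_mm_dd helper is hypothetical and is not part of AP Python.

```python
import re
from datetime import datetime

# Four-digit year, two-digit month, and two-digit day separated by slashes.
DATE_PATTERN = re.compile(r"\d{4}/\d{2}/\d{2}")


def is_yyyy_mm_dd(value):
    """Return True if value is a valid calendar date written as YYYY/MM/DD."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y/%m/%d")
    except ValueError:
        return False
    return True


for candidate in ("1998/05/29", "1998-05-29", "1998/13/01"):
    print(candidate, is_yyyy_mm_dd(candidate))
```

The regular expression rejects unpadded values such as 1998/5/29, which strptime alone would accept.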
Example 4: Input date and time passed as a string
In the following example, the 1998/05/29 10:54:47 string is used as the data, which is tokenized using the datetime Datetime data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if the input date and time string in YYYY/MM/DD HH:MM:SS MMM format is provided, then only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element must be used to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("1998/05/29 10:54:47", "datetime")
print("Protected data: "+str(output))
Result
Protected data: 0634/01/28 10:54:47
Example 5: Unicode Input passed as a String
In the following example, the ‘protegrity1234ÀÁÂÃÄÅÆÇÈÉ’ Unicode string is used as the input data, which is tokenized using the string data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect('protegrity1234ÀÁÂÃÄÅÆÇÈÉ', "string")
print("Protected Data: %s" %output)
Result
Protected Data: VSYaLoLxo8GMyqÀÁÂÃÄÅÆÇÈÉ
Example - Tokenizing String Data with External Initialization Vector (IV)
The example for using the protect API for tokenizing string data using an external initialization vector (IV) is described in this section.
If you want to pass the external IV as a keyword argument to the protect API, then you must first pass the external IV as bytes to the API.
Example
In this example, the Protegrity1 string is used as the data tokenized using the string data element, with the help of the external IV 1234 passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "string",
    external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %output)
Result
Protected Data: oEquECC2JYb
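Since the external IV must be passed as bytes, a small conversion helper can guard against a common Python pitfall: calling bytes() on an integer creates that many zero bytes instead of encoding the digits. The following is a minimal sketch. The as_iv_bytes helper is hypothetical and is not part of AP Python.

```python
def as_iv_bytes(iv, encoding="utf-8"):
    """Return the external IV as bytes, accepting either str or bytes input."""
    if isinstance(iv, bytes):
        return iv
    if isinstance(iv, str):
        return iv.encode(encoding)
    # bytes(1234) would silently produce 1234 zero bytes, so reject other types.
    raise TypeError(f"external IV must be str or bytes, not {type(iv).__name__}")


iv = as_iv_bytes("1234")
print(iv)
print(iv == bytes("1234", encoding="utf-8"))
```

The result can then be passed as the external_iv keyword argument of the protect API.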
Example - Encrypting String Data
The example for using the protect API for encrypting the string data is described in this section.
If you want to encrypt the data, then you must use bytes in the encrypt_to keyword.
To avoid data corruption, do not convert the encrypted bytes data into the string format. It is recommended to convert the encrypted bytes data to a Hexadecimal, Base 64, or any other appropriate format.
Example
In the following example, the Protegrity1 string is used as the data, which is encrypted using the text data element (generic placeholder for an encryption-capable element). Therefore, the encrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "text",
    encrypt_to=bytes)
print("Encrypted Data: %s" %output)
Result
Encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
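To store or transmit the encrypted bytes as text without corrupting them, encode them as hexadecimal or Base64 and decode them before unprotecting. The following is a minimal sketch that uses only the Python standard library and the sample encrypted value from the result above.

```python
import base64

# Sample encrypted value copied from the result above.
encrypted = b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'

# Encode to text-safe representations.
as_hex = encrypted.hex()
as_b64 = base64.b64encode(encrypted).decode("ascii")

# Decode back to the exact original bytes.
from_hex = bytes.fromhex(as_hex)
from_b64 = base64.b64decode(as_b64)

print(as_hex)
print(as_b64)
print(from_hex == encrypted and from_b64 == encrypted)
```

Both encodings round-trip exactly, whereas decoding the raw bytes as a string can fail or alter the data.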
Example - Tokenizing Bulk String Data
The example for using the protect API for tokenizing bulk string data is described in this section. The bulk string data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
Example 1: Input bulk string data
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is tokenized using the string data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
Result
Protected Data:
(['VSYaLoLxo8GMyq', '4l0z9SQrhtk', '9xP5wBuXJuce'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example 2: Input bulk string data
In Example 1, the protected output was a tuple of the tokenized data and the error list. This example shows how the code can be tweaked to ensure that the protected output and the error list are retrieved separately, and not as part of a tuple.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out, error_list = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
print("Error List: ")
print(error_list)
Result
Protected Data:
['VSYaLoLxo8GMyq', '4l0z9SQrhtk', '9xP5wBuXJuce']
Error List:
(6, 6, 6)
6 is the success return code for the protect operation of each element in the list.
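Because a bulk call returns error codes instead of throwing exceptions, the caller needs to inspect the code of each element. The following is a minimal sketch that pairs each input with its protected value and separates the failures, using 6 as the success code as stated above. The pair_bulk_result helper is hypothetical, and the sample tuple is copied from the result of Example 1.

```python
SUCCESS_CODE = 6  # success return code for the protect operation


def pair_bulk_result(inputs, result, success_code=SUCCESS_CODE):
    """Split a bulk protect result into (input, output) successes and (input, code) failures."""
    outputs, codes = result
    succeeded, failed = [], []
    for original, protected, code in zip(inputs, outputs, codes):
        if code == success_code:
            succeeded.append((original, protected))
        else:
            failed.append((original, code))
    return succeeded, failed


data = ["protegrity1234", "Protegrity1", "Protegrity56"]
# Shape of the tuple returned by a bulk protect call, as shown in Example 1.
bulk_result = (["VSYaLoLxo8GMyq", "4l0z9SQrhtk", "9xP5wBuXJuce"], (6, 6, 6))

succeeded, failed = pair_bulk_result(data, bulk_result)
print(succeeded)
print(failed)
```

Any element that lands in the failed list can be looked up in Application Protector Return Codes.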
Example 3: Input date passed as bulk strings
In the following example, the 2019/02/14 and 2018/03/11 strings are stored in a list and used as bulk data, which is tokenized using the datetime Date data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14", "2018/03/11"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
Result
Protected data: (['1072/07/29', '0907/12/30'], (6, 6))
6 is the success return code for the protect operation of each element in the list.
Example 4: Input date and time passed as bulk strings
In the following example, the 2019/02/14 10:54:47 and 2019/11/03 11:01:32 strings are stored in a list and used as bulk data, which is tokenized using the datetime Datetime data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if you have provided the input date and time string in YYYY/MM/DD HH:MM:SS MMM format, then you must use only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14 10:54:47", "2019/11/03 11:01:32"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
Result
Protected data: (['1072/07/29 10:54:47', '2249/12/17 11:01:32'], (6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Encrypting Bulk String Data
The example for using the protect API for encrypting bulk string data is described in this section. The bulk string data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is encrypted using the text data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: ")
print(p_out)
Result
Encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Tokenizing Bulk String Data with External IV
The example for using the protect API for tokenizing bulk string data using external IV is described in this section. The bulk string data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you want to pass the external IV as a keyword argument to the protect API, then you must pass external IV as bytes.
Example
In this example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data. This bulk data is tokenized using the string data element, with the help of external IV 123 that is passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string",
    external_iv=bytes("123", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
Result
Protected Data:
(['qMrwdI3iiT9D14', 'JpytdIbc16c', 'fTY1RhNGRJAa'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Tokenizing Integer Data
The example for using the protect API for tokenizing integer data is described in this section.
Example
In the following example, 21 is used as the integer data, which is tokenized using the int data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "int")
print("Protected Data: %s" %output)
Result
Protected Data: -94623223
Example - Tokenizing Integer Data with External IV
The example for using the protect API for tokenizing integer data using the external IV is described in this section.
If you want to pass the external IV as a keyword argument to the protect API, then you must pass the external IV as bytes to the API.
Example
In this example, 21 is used as the integer data, which is tokenized using the int data element, with the help of external IV 1234 passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "int", external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %output)
Result
Protected Data: 1983567415
Example - Encrypting Integer Data
The example for using the protect API for encrypting integer data is described in this section.
If you want to encrypt the data, then you must use bytes in the encrypt_to keyword.
To avoid data corruption, do not convert the encrypted bytes data into string format. It is recommended to convert the encrypted bytes data to a Hexadecimal, Base 64, or any other appropriate format.
Example
In the following example, 21 is used as the integer data, which is encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument, and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "text", encrypt_to=bytes)
print("Encrypted Data: %s" %output)
Result
Encrypted Data: b'\xf73\xb9\x7f\x94\xdf;\xbd\x02=\x877\x91]\x1b#'
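The recommendation to store encrypted bytes as hexadecimal or Base64 can be sketched with the Python standard library. The bytes literal below is copied from the result above; the variable names are illustrative only.

```python
import base64

# Encrypted bytes copied from the result above.
encrypted = b'\xf73\xb9\x7f\x94\xdf;\xbd\x02=\x877\x91]\x1b#'

hex_text = encrypted.hex()                              # text-safe hexadecimal form
b64_text = base64.b64encode(encrypted).decode("ascii")  # text-safe Base64 form

# Both encodings are reversible, so the original bytes can be recovered exactly.
assert bytes.fromhex(hex_text) == encrypted
assert base64.b64decode(b64_text) == encrypted
print(hex_text)
print(b64_text)
```

Either text form can be stored or transmitted safely and converted back to bytes before the data is decrypted.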
Example - Tokenizing Bulk Integer Data
The example for using the protect API for tokenizing bulk integer data is described in this section. The bulk integer data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is tokenized using the int data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int")
print("Protected Data: ")
print(p_out)
Result
Protected Data:
([-94623223, -572010955, 2021989009], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
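The requirement that all elements of a bulk list or tuple share one data type can be checked before calling the API. The following is a minimal sketch; check_same_type is a hypothetical helper and not part of the AP Python module.

```python
def check_same_type(items):
    """Raise TypeError unless all elements share one exact type (hypothetical helper)."""
    kinds = {type(item) for item in items}
    if len(kinds) > 1:
        names = sorted(kind.__name__ for kind in kinds)
        raise TypeError("mixed element types in bulk data: " + ", ".join(names))
    return items


data = check_same_type([21, 42, 55])      # passes: all elements are int
try:
    check_same_type([21, "42", 55])       # fails: int mixed with str
except TypeError as err:
    print(err)
```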
Example - Tokenizing Bulk Integer Data with External IV
The example for using the protect API for tokenizing bulk integer data using external IV is described in this section. The bulk integer data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you want to pass the external IV as a keyword argument to the protect API, then you must pass the external IV as bytes to the API.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is tokenized using the int data element, with the help of external IV 1234 that is passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int", external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
Result
Protected Data:
([1983567415, -1471024670, 1465229692], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Encrypting Bulk Integer Data
The example for using the protect API for encrypting bulk integer data is described in this section. The bulk integer data can be passed as a list or a tuple.
If you want to encrypt the data, then you must use bytes in the encrypt_to keyword.
To avoid data corruption, do not convert the encrypted bytes data into string format. Instead, convert the encrypted bytes data to hexadecimal, Base64, or another appropriate format.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: ")
print(p_out)
Result
Encrypted Data:
([b'\xf73\xb9\x7f\x94\xdf;\xbd\x02=\x877\x91]\x1b#', b'\x13\x92\xcd+\xb5\xb5\x8a\x98-$3\xa4\x00bNx', b'\xe5\xa1C\xf4HI\xe8\xe1F\x90=\xd9\xb4*pG'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Tokenizing Bytes Data
The example for using the protect API for tokenizing bytes data is described in this section.
Example
In the following example, “Protegrity1” string is first converted to bytes using the Python bytes() method. The bytes data is then tokenized using the string data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "string")
print("Protected Data: %s" %p_out)
Result
Protected Data: b'4l0z9SQrhtk'
Example - Tokenizing Bytes Data with External IV
The example for using the protect API for tokenizing bytes data using external IV is described in this section.
Example
In the following example, “Protegrity1” string is first converted to bytes using the Python bytes() method. The bytes data is then tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
output = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %output)
Result
Protected Data: b'oEquECC2JYb'
Example - Encrypting Bytes Data
The example for using the protect API for encrypting bytes data is described in this section.
To avoid data corruption, do not convert the encrypted bytes data into string format. Instead, convert the encrypted bytes data to hexadecimal, Base64, or another appropriate format.
Example
In the following example, the “Protegrity1” string is first converted to bytes using the Python bytes() method. The bytes data is then encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "text", encrypt_to = bytes)
print("Encrypted Data: %s" %p_out)
Result
Encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
Example - Tokenizing Bulk Bytes Data
The example for using the protect API for tokenizing bulk bytes data is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
Example
In the following example, the protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is tokenized using the string data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="UTF-8"), bytes("Protegrity1",
encoding="UTF-8"), bytes("Protegrity56", encoding="UTF-8")]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
Result
Protected Data:
([b'VSYaLoLxo8GMyq', b'4l0z9SQrhtk', b'9xP5wBuXJuce'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Tokenizing Bytes Data
The example for using the protect API for tokenizing bytes data is described in this section.
Example
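In the following example, the “Protegrity1” string is first converted to bytes in UTF-16LE encoding using the Python bytes() method. The bytes data is then tokenized using the string data element, with the charset parameter set to Charset.UTF16LE to match the encoding of the input data.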
from appython import Protector
from appython import Charset
protector = Protector()
session = protector.create_session("superuser")
data = bytes("Protegrity1", encoding="utf-16le")
p_out = session.protect(data, "string", encrypt_to=bytes, charset=Charset.UTF16LE)
print("Protected Data: %s" %p_out)
Result
Protected Data: b'4\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k\x00'
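The token bytes in the result above are encoded as UTF-16LE, which is why every character is followed by a zero byte. The following sketch uses only the Python standard library to decode the bytes with the matching encoding and to show how the byte layout of the same string differs between encodings. It does not call the protect API.

```python
# Token bytes copied from the result above (produced with charset=Charset.UTF16LE).
token_utf16le = b'4\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k\x00'

# Decoding with the matching encoding yields readable token text.
print(token_utf16le.decode("utf-16le"))

# The same input string has a different byte layout in each encoding,
# which is why the charset must match how the input bytes were produced.
for encoding in ("utf-8", "utf-16le", "utf-16be"):
    print(encoding, bytes("Protegrity1", encoding=encoding))
```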
Example - Tokenizing Bulk Bytes Data with External IV
The example for using the protect API for tokenizing bulk bytes data using external IV is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
Example
In the following example, the protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="UTF-8"), bytes("Protegrity1",
encoding="UTF-8"), bytes("Protegrity56", encoding="UTF-8")]
p_out = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
Result
Protected Data:
([b'aCzyqwijkSDqiG', b'oEquECC2JYb', b't0Ly7KYx7Wyo'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Encrypting Bulk Bytes Data
The example for using the protect API for encrypting bulk bytes data is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
To avoid data corruption, do not convert the encrypted bytes data into string format. Instead, convert the encrypted bytes data to hexadecimal, Base64, or another appropriate format.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="UTF-8"), bytes("Protegrity1",
encoding="UTF-8"), bytes("Protegrity56", encoding="UTF-8")]
p_out = session.protect(data, "text", encrypt_to = bytes)
print("Encrypted Data: ")
print(p_out)
Result
Encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (6, 6, 6))
6 is the success return code for the protect operation of each element in the list.
Example - Tokenizing Date Objects
The examples for using the protect API for tokenizing the date objects are described in this section.
If a date object is provided as input, then the data element with the same tokenization type as the input date format must be used for data protection. For example, if you have provided the input date object in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
Example: Input date object in YYYY/MM/DD format
In the following example, the 1998/05/29 date string is used as the data, which is first converted to a date object using the Python date method of the datetime module.
The date object is then tokenized using the datetime data element.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data = datetime.strptime("1998/05/29", "%Y/%m/%d").date()
print("\nInput date as a Date object : "+str(data))
p_out = session.protect(data, "datetime")
print("Protected date: "+str(p_out))
Result
Input date as a Date object : 1998-05-29
Protected date: 0634-01-28
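The result above prints the date with hyphens even though the input string used slashes, because str() of a Python date object always uses the ISO format. The following sketch uses only the standard library to show the conversion in both directions; it does not call the protect API.

```python
from datetime import date, datetime

# Parse a YYYY/MM/DD string into a date object, as in the example above.
as_date = datetime.strptime("1998/05/29", "%Y/%m/%d").date()
print(isinstance(as_date, date), as_date)  # str() of a date uses the ISO format

# Format the date object back into the YYYY/MM/DD string form.
print(as_date.strftime("%Y/%m/%d"))
```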
Example - Tokenizing Bulk Date Objects
The example for using the protect API for tokenizing bulk date objects is described in this section. The bulk date objects can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If a date object is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date object in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
Example: Input as a Date Object
In the following example, the 2019/02/12 and 2018/01/11 date strings are used as the data, which are first converted to date objects using the Python date method of the datetime module. The two date objects are then used to create a list, which is used as the input data.
The input list is then tokenized using the datetime data element.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data1 = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
data2 = datetime.strptime("2018/01/11", "%Y/%m/%d").date()
data = [data1, data2]
print("Input data: ", str(data))
p_out = session.protect(data, "datetime")
print("Protected data: "+str(p_out))
Result
Input data: [datetime.date(2019, 2, 12), datetime.date(2018, 1, 11)]
Protected data: ([datetime.date(1154, 10, 29), datetime.date(1543, 1, 5)], (6, 6))
6 is the success return code for the protect operation of each element in the list.
unprotect
This function returns the data in its original form.
def unprotect(self, data, de, **kwargs)
Note: Do not pass the self parameter while invoking the API.
Parameters
data: Data to be unprotected.
de: String containing the data element name defined in policy.
**kwargs: Specify one or more of the following keyword arguments:
- external_iv: Specify the external initialization vector for Tokenization. This argument is optional.
- decrypt_to: Specify this argument when decrypting data, and set its value to the data type of the original data. For example, if you are unprotecting string data, then you must specify the output data type as str. This argument is mandatory for decryption and must not be used for tokenization. The possible values for the decrypt_to argument are:
- str
- int
- bytes
- charset: This is an optional argument. It indicates the encoding and byte order of the input buffer. You can specify a value for this argument from the charset constants, such as UTF8, UTF16LE, or UTF16BE. The default value for the charset argument is UTF8.
The charset argument is only applicable for the input data of byte type.
The charset parameter is mandatory for the data elements created with Unicode Gen2 tokenization method for byte APIs. The encoding set for the charset parameter must match the encoding of the input data passed.
Keyword arguments are case sensitive.
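Because decrypt_to must be set to the data type of the original data, one option is to derive it from a value of that type. The following is a minimal sketch; pick_decrypt_to is a hypothetical helper and not part of the AP Python module, and the supported types are taken from the list of possible values above.

```python
SUPPORTED_DECRYPT_TO = (str, int, bytes)  # possible values listed above


def pick_decrypt_to(original_value):
    """Return the type to pass as decrypt_to for a value that was encrypted."""
    kind = type(original_value)
    if kind not in SUPPORTED_DECRYPT_TO:
        raise TypeError("unsupported type for decrypt_to: " + kind.__name__)
    return kind


print(pick_decrypt_to("Protegrity1"))
print(pick_decrypt_to(21))
```

The returned type could then be supplied as the decrypt_to keyword argument when decrypting data that was encrypted from a value of that type.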
Returns
- For single data: Returns the unprotected data
- For bulk data: Returns a tuple of the following data:
- List or tuple of the unprotected data
- Tuple of error codes
Exceptions
InvalidSessionError: This exception is thrown if the session is invalid or has timed out.
ProtectError: This exception is thrown if the API is unable to unprotect the data.
If the unprotect API is used with bulk data, then it does not throw any exception. Instead, it only returns an error code.
For more information about the return codes, refer to Application Protector API Return Codes.
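Because the bulk form reports problems through return codes instead of exceptions, the caller needs to inspect the codes. The following sketch separates successful elements from failed ones. It uses literal data shaped like a bulk unprotect result, assumes 8 as the unprotect success code as stated in the examples in this section, and the helper name split_bulk_result is hypothetical.

```python
UNPROTECT_SUCCESS = 8  # success return code for unprotect, per the examples in this section


def split_bulk_result(values, return_codes, success_code=UNPROTECT_SUCCESS):
    """Separate successful values from (index, code) pairs of failed elements."""
    succeeded, failed = [], []
    for index, (value, code) in enumerate(zip(values, return_codes)):
        if code == success_code:
            succeeded.append(value)
        else:
            failed.append((index, code))
    return succeeded, failed


# Literal data shaped like a bulk unprotect result: (values, return codes).
values, codes = (["protegrity1234", "Protegrity1", "Protegrity56"], (8, 8, 8))
ok, failed = split_bulk_result(values, codes)
print(ok, failed)
```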
Example - Detokenizing String Data
The examples for using the unprotect API for retrieving the original string data from the token data are described in this section.
Example 1: Input string data
In the following example, the Protegrity1 string that was tokenized using the string data element, is now detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "string")
print("Protected Data: %s" %output)
org = session.unprotect(output, "string")
print("Unprotected Data: %s" %org)
Result
Protected Data: 4l0z9SQrhtk
Unprotected Data: Protegrity1
Example 2: Input date passed as a string
In the following example, the 1998/05/29 string that was tokenized using the datetime Date data element, is now detokenized using the same data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("1998/05/29", "datetime")
print("Protected data: "+str(output))
org = session.unprotect(output, "datetime")
print("Unprotected data: "+str(org))
Result
Protected data: 0634/01/28
Unprotected data: 1998/05/29
Example 3: Input date and time passed as a string
In the following example, the 1998/05/29 10:54:47 string that was tokenized using the datetime Datetime data element is now detokenized using the same data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if the input date and time string in YYYY/MM/DD HH:MM:SS MMM format is provided, then only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element must be used to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("1998/05/29 10:54:47", "datetime")
print("Protected data: "+str(output))
org = session.unprotect(output, "datetime")
print("Unprotected data: "+str(org))
Result
Protected data: 0634/01/28 10:54:47
Unprotected data: 1998/05/29 10:54:47
Example 4: Detokenizing Unicode Data passed as String
In the following example, the ‘protegrity1234ÀÁÂÃÄÅÆÇÈÉ’ unicode data that was tokenized using the string data element, is now detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect('protegrity1234ÀÁÂÃÄÅÆÇÈÉ', "string")
print("Protected Data: %s" %output)
org = session.unprotect(output, "string")
print("Unprotected Data: %s" %org)
Result
Protected Data: VSYaLoLxo8GMyqÀÁÂÃÄÅÆÇÈÉ
Unprotected Data: protegrity1234ÀÁÂÃÄÅÆÇÈÉ
Example - Detokenizing String Data with External IV
The example for using the unprotect API for retrieving the original string data from token data, using external IV is described in this section.
If you want to pass the external IV as a keyword argument to the unprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, the Protegrity1 string that was tokenized using the string data element and the external IV 1234 is now detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %output)
org = session.unprotect(output, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Unprotected Data: %s" %org)
Result
Protected Data: oEquECC2JYb
Unprotected Data: Protegrity1
Example - Decrypting String Data
The example for using the unprotect API for decrypting string data is described in this section.
If you want to decrypt the data, then you must set the decrypt_to keyword to the data type of the original data.
Example
In the following example, the Protegrity1 string that was encrypted using the text data element is now decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to str.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "text",
encrypt_to=bytes)
print("Encrypted Data: %s" %output)
org = session.unprotect(output, "text", decrypt_to=str)
print("Decrypted Data: %s" %org)
Result
Encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
Decrypted Data: Protegrity1
Example - Detokenizing Bulk String Data
The examples for using the unprotect API for retrieving the original bulk string data from the token data are described in this section.
Example 1: Input bulk string data
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is tokenized using the string data element. The bulk string data is then detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
out = session.unprotect(p_out[0], "string")
print("Unprotected Data: ")
print(out)
Result
Protected Data:
(['VSYaLoLxo8GMyq', '4l0z9SQrhtk', '9xP5wBuXJuce'], (6, 6, 6))
Unprotected Data:
(['protegrity1234', 'Protegrity1', 'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example 2: Input bulk string data
In Example 1, the unprotected output was a tuple of the detokenized data and the error list. This example shows how the code can be tweaked to ensure that the unprotected output and the error list are retrieved separately, and not as part of a tuple.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = "protegrity1234"
data = [data]*5
p_out, error_list = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
print("Error List: ")
print(error_list)
org, error_list = session.unprotect(p_out, "string")
print("Unprotected Data: ")
print(org)
print("Error List: ")
print(error_list)
Result
Protected Data:
['VSYaLoLxo8GMyq', 'VSYaLoLxo8GMyq', 'VSYaLoLxo8GMyq', 'VSYaLoLxo8GMyq', 'VSYaLoLxo8GMyq']
Error List:
(6, 6, 6, 6, 6)
Unprotected Data:
['protegrity1234', 'protegrity1234', 'protegrity1234', 'protegrity1234', 'protegrity1234']
Error List:
(8, 8, 8, 8, 8)
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example 3: Input date passed as bulk strings
In the following example, the 2019/02/14 and 2018/03/11 strings are stored in a list and used as bulk data, which is tokenized using the datetime Date data element. The bulk string data is then detokenized using the same data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14", "2018/03/11"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
org = session.unprotect(output[0], "datetime")
print("Unprotected data: "+str(org))
Result
Protected data: (['1072/07/29', '0907/12/30'], (6, 6))
Unprotected data: (['2019/02/14', '2018/03/11'], (8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example 4: Input date and time passed as bulk strings
In the following example, the 2019/02/14 10:54:47 and 2019/11/03 11:01:32 strings are stored in a list and used as bulk data, which is tokenized using the datetime Datetime data element. The bulk string data is then detokenized using the same data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if you have provided the input date and time string in YYYY/MM/DD HH:MM:SS MMM format, then you must use only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element to protect the data.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14 10:54:47", "2019/11/03 11:01:32"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
org = session.unprotect(output[0], "datetime")
print("Unprotected data: "+str(org))
Result
Protected data: (['1072/07/29 10:54:47', '2249/12/17 11:01:32'], (6, 6))
Unprotected data: (['2019/02/14 10:54:47', '2019/11/03 11:01:32'], (8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Detokenizing Bulk String Data with External IV
The example for using the unprotect API for retrieving the original bulk string data from token data using the external IV is described in this section.
If you want to pass the external IV as a keyword argument to the unprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is tokenized using the string data element, with the help of external IV 123 that is passed as bytes. The bulk string data is then detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string",
external_iv=bytes("123", encoding="UTF-8"))
print("Protected Data: ")
print(p_out)
out = session.unprotect(p_out[0], "string",
external_iv=bytes("123", encoding="UTF-8"))
print("Unprotected Data: ")
print(out)
Result
Protected Data:
(['qMrwdI3iiT9D14', 'JpytdIbc16c', 'fTY1RhNGRJAa'], (6, 6, 6))
Unprotected Data:
(['protegrity1234', 'Protegrity1', 'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Decrypting Bulk String Data
The example for using the unprotect API for decrypting bulk string data is described in this section.
If you want to decrypt the data, then you must set the decrypt_to keyword to the data type of the original data.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is encrypted using the text data element. The bulk string data is then decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to str.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: ")
print(p_out)
out = session.unprotect(p_out[0], "text", decrypt_to=str)
print("Decrypted Data: ")
print(out)
Result
Encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (6, 6, 6))
Decrypted Data:
(['protegrity1234', 'Protegrity1', 'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Detokenizing Integer Data
The example for using the unprotect API for retrieving the original integer data from token data is described in this section.
Example
In the following example, the integer data 21 that was tokenized using the int data element, is now detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "int")
print("Protected Data: %s" %output)
org = session.unprotect(output, "int")
print("Unprotected Data: %s" %org)
Result
Protected Data: -94623223
Unprotected Data: 21
Example - Detokenizing Integer Data with External IV
The example for using the unprotect API for retrieving the original integer data from token data, using external IV is described in this section.
If you want to pass the external IV as a keyword argument to the unprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, the integer data 21 that was tokenized using the int data element and the external IV 1234 is now detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "int",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %output)
org = session.unprotect(output, "int",
external_iv=bytes("1234", encoding="utf-8"))
print("Unprotected Data: %s" %org)
Result
Protected Data: 1983567415
Unprotected Data: 21
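The requirement that the external IV be passed as bytes can be enforced with a small helper before the API is called. The following is a minimal sketch, not part of the AP Python API: the helper name to_external_iv is hypothetical, and it assumes that an IV supplied as a string or an integer should be UTF-8 encoded, as bytes("1234", encoding="utf-8") does in the example above.

```python
def to_external_iv(value):
    """Return an external IV as bytes (hypothetical helper, not part of appython)."""
    if isinstance(value, bytes):
        return value
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (str, int)):
        raise TypeError("external IV must be bytes, str, or int")
    # Assumption: text and integer IVs are UTF-8 encoded, as in the example above.
    return str(value).encode("utf-8")


print(to_external_iv("1234"))  # b'1234'
print(to_external_iv(1234))    # b'1234'
```

The returned value can then be passed as the external_iv keyword argument to both the protect and unprotect calls.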
Example - Decrypting Integer Data
The example for using the unprotect API for decrypting integer data is described in this section.
If you want to decrypt the data, then you must pass the decrypt_to keyword argument and set its value to the data type of the original data, which is int in this example.
Example
In the following example, the integer data 21 that was encrypted using the text data element is now decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to int.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "text", encrypt_to=bytes)
print("Encrypted Data: %s" %output)
org = session.unprotect(output, "text", decrypt_to=int)
print("Decrypted Data: %s" %org)
Result
Encrypted Data: b'\xf73\xb9\x7f\x94\xdf;\xbd\x02=\x877\x91]\x1b#'
Decrypted Data: 21
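In the decryption examples in this section, the value passed as decrypt_to matches the Python type of the original data (str, int, or bytes). The sketch below shows one way to derive that value from a sample of the original data. The helper name decrypt_target is hypothetical and is not part of the AP Python API, and the set of supported types is an assumption based only on the examples shown here.

```python
def decrypt_target(original_sample):
    """Pick a decrypt_to value that mirrors the original data's type (hypothetical helper)."""
    supported = (str, int, bytes)  # assumption: the types used in this section's examples
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(original_sample, bool) or not isinstance(original_sample, supported):
        raise TypeError(f"unsupported type: {type(original_sample).__name__}")
    return type(original_sample)


print(decrypt_target(21))              # <class 'int'>
print(decrypt_target("Protegrity1"))   # <class 'str'>
print(decrypt_target(b"Protegrity1"))  # <class 'bytes'>
```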
Example - Detokenizing Bulk Integer Data
The example for using the unprotect API for retrieving the original bulk integer data from token data is described in this section.
The AP Python APIs support integer values only between -2147483648 and 2147483648, both inclusive.
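A range check before calling the API can catch unsupported integers early. This is a minimal sketch: the function and constant names are hypothetical, and the bounds are copied as-is from the range stated above.

```python
# Bounds as stated in this section (both inclusive); the names are illustrative.
AP_INT_MIN = -2147483648
AP_INT_MAX = 2147483648


def unsupported_integers(values):
    """Return the values that fall outside the documented integer range."""
    return [v for v in values if not AP_INT_MIN <= v <= AP_INT_MAX]


print(unsupported_integers([21, 42, 55]))  # []
print(unsupported_integers([21, 2**40]))   # [1099511627776]
```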
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is tokenized using the int data element. The bulk integer data is then detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int")
print("Protected Data: ")
print(p_out)
out = session.unprotect(p_out[0], "int")
print("Unprotected Data: ")
print(out)
Result
Protected Data:
([-94623223, -572010955, 2021989009], (6, 6, 6))
Unprotected Data:
([21, 42, 55], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
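Bulk calls return a tuple that holds the output values and a tuple of return codes, as shown in the result above. The sketch below shows one way to verify those codes before the values are used. The helper name values_or_raise is hypothetical and not part of the AP Python API; the success codes 6 and 8 are the ones listed above.

```python
PROTECT_OK = 6    # success code for protect, per the list above
UNPROTECT_OK = 8  # success code for unprotect, per the list above


def values_or_raise(bulk_result, success_code):
    """Return the values of a bulk result, or raise if any element failed."""
    values, codes = bulk_result
    failed = {i: code for i, code in enumerate(codes) if code != success_code}
    if failed:
        raise ValueError(f"elements failed (index: return code): {failed}")
    return values


# Result tuple copied from the example output above.
out = ([21, 42, 55], (8, 8, 8))
print(values_or_raise(out, UNPROTECT_OK))  # [21, 42, 55]
```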
Example - Detokenizing Bulk Integer Data with External IV
The example for using the unprotect API for retrieving the original bulk integer data from token data using external IV is described in this section.
If you want to pass the external IV as a keyword argument to the unprotect API, then you must pass the external IV as bytes to the API.
Example
In this example, 21, 42, and 55 integers are stored in a list and used as bulk data. This bulk data is tokenized using the int data element, with the help of external IV 1234 that is passed as bytes. The bulk integer data is then detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int", external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
out = session.unprotect(p_out[0], "int", external_iv=bytes("1234", encoding="utf-8"))
print("Unprotected Data: ")
print(out)
Result
Protected Data:
([1983567415, -1471024670, 1465229692], (6, 6, 6))
Unprotected Data:
([21, 42, 55], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Decrypting Bulk Integer Data
The example for using the unprotect API for decrypting bulk integer data is described in this section.
If you want to decrypt the data, then you must pass the decrypt_to keyword argument and set its value to the data type of the original data, which is int in this example.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is encrypted using the text data element. The bulk integer data is then decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to int.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: ")
print(p_out)
out = session.unprotect(p_out[0], "text", decrypt_to=int)
print("Decrypted Data: ")
print(out)
Result
Encrypted Data:
([b'\xf73\xb9\x7f\x94\xdf;\xbd\x02=\x877\x91]\x1b#', b'\x13\x92\xcd+\xb5\xb5\x8a\x98-$3\xa4\x00bNx', b'\xe5\xa1C\xf4HI\xe8\xe1F\x90=\xd9\xb4*pG'], (6, 6, 6))
Decrypted Data:
([21, 42, 55], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Detokenizing Bytes Data
The example for using the unprotect API for retrieving the original bytes data from the token data is described in this section.
Example
In the following example, the bytes data ‘Protegrity1’ that was tokenized using the string data element, is now detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "string")
print("Protected Data: %s" %p_out)
org = session.unprotect(p_out, "string")
print("Unprotected Data: %s" %org)
Result
Protected Data: b'4l0z9SQrhtk'
Unprotected Data: b'Protegrity1'
In the following example, the bytes data ‘Protegrity1’, encoded as UTF-16LE, is tokenized using the string data element with the charset argument set to Charset.UTF16LE. It is then detokenized using the same data element and charset.
from appython import Protector
from appython import Charset
protector = Protector()
session = protector.create_session("superuser")
data = bytes("Protegrity1", encoding="utf-16le")
p_out = session.protect(data, "string", encrypt_to=bytes, charset=Charset.UTF16LE)
print("Protected Data: %s" %p_out)
org = session.unprotect(p_out, "string", decrypt_to=bytes, charset=Charset.UTF16LE)
print("Unprotected Data: %s" %org)
Result
Protected Data: b'4\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k\x00'
Unprotected Data: b'P\x00r\x00o\x00t\x00e\x00g\x00r\x00i\x00t\x00y\x001\x00'
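The byte strings in the result above are the UTF-16LE encodings of the token and of the original text. This can be confirmed with plain Python, without calling the API. The token value is copied from the result above.

```python
# Encode the original text the same way as the example input.
original = "Protegrity1".encode("utf-16le")
print(original)
# b'P\x00r\x00o\x00t\x00e\x00g\x00r\x00i\x00t\x00y\x001\x00'

# Decode the protected bytes from the result to see the token as text.
token = b'4\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k\x00'
print(token.decode("utf-16le"))  # 4l0z9SQrhtk
```

The decoded token is the same text as the token in the preceding UTF-8 example, which shows why the charset argument must match the encoding of the input bytes.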
Example - Detokenizing Bytes Data with External IV
The example for using the unprotect API for retrieving the original bytes data from the token data using external IV is described in this section.
Example
In this example, the bytes data ‘Protegrity1’ was tokenized using the string data element and the external IV 1234. It is now detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %p_out)
org = session.unprotect(p_out, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Unprotected Data: %s" %org)
Result
Protected Data: b'oEquECC2JYb'
Unprotected Data: b'Protegrity1'
Example - Decrypting Bytes Data
The example for using the unprotect API for decrypting bytes data is described in this section.
Example
In the following example, the bytes data b’Protegrity1’ that was encrypted using the text data element, is now decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: %s" %p_out)
org = session.unprotect(p_out, "text", decrypt_to=bytes)
print("Decrypted Data: %s" %org)
Result
Encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
Decrypted Data: b'Protegrity1'
Example - Detokenizing Bulk Bytes Data
The example for using the unprotect API for retrieving the original bulk bytes data from the token data is described in this section.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is tokenized using the string data element. The bulk bytes data is then detokenized using the same data element.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234","utf-8"), bytes("Protegrity1","utf-8"), bytes("Protegrity56","utf-8")]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
org = session.unprotect(p_out[0], "string")
print("Unprotected Data: ")
print(org)
Result
Protected Data:
([b'VSYaLoLxo8GMyq', b'4l0z9SQrhtk', b'9xP5wBuXJuce'], (6, 6, 6))
Unprotected Data:
([b'protegrity1234', b'Protegrity1', b'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Detokenizing Bulk Bytes Data with External IV
The example for using the unprotect API for retrieving the original bulk bytes data from the token data using external IV is described in this section.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data. This bulk data is tokenized using the string data element, with the help of external IV 1234 passed as bytes. The bulk bytes data is then detokenized using the same data element and external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234","utf-8"), bytes("Protegrity1","utf-8"), bytes("Protegrity56","utf-8")]
p_out = session.protect(data, "string",
external_iv=bytes("1234","utf-8"))
print("Protected Data: ")
print(p_out)
org = session.unprotect(p_out[0], "string",
external_iv=bytes("1234","utf-8"))
print("Unprotected Data: ")
print(org)
Result
Protected Data:
([b'aCzyqwijkSDqiG', b'oEquECC2JYb', b't0Ly7KYx7Wyo'], (6, 6, 6))
Unprotected Data:
([b'protegrity1234', b'Protegrity1', b'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Decrypting Bulk Bytes Data
The example for using the unprotect API for decrypting bulk bytes data is described in this section.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is encrypted using the text data element. The bulk bytes data is then decrypted using the same data element. Therefore, the decrypt_to parameter is passed as a keyword argument and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="UTF-8"), bytes("Protegrity1", encoding="UTF-8"), bytes("Protegrity56", encoding="UTF-8")]
p_out = session.protect(data, "text", encrypt_to=bytes)
print("Encrypted Data: ")
print(p_out)
org = session.unprotect(p_out[0], "text", decrypt_to=bytes)
print("Decrypted Data: ")
print(org)
Result
Encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (6, 6, 6))
Decrypted Data:
([b'protegrity1234', b'Protegrity1', b'Protegrity56'], (8, 8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
Example - Detokenizing Date Objects
The example for using the unprotect API for retrieving the original date objects from token data is described in this section.
If a date object is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date object in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
Example 1: Input date object in MM.DD.YYYY format
In this example, the 2019/12/02 date string is used as the data, which is first converted to a date object using the Python date method of the datetime module.
The date object is then tokenized using the datetime data element, and then detokenized using the same data element.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data = datetime.strptime("2019/12/02", "%Y/%m/%d").date()
print("\nInput date as a Date object : "+str(data))
p_out = session.protect(data, "datetime")
print("Protected date: "+str(p_out))
unprotected_output = session.unprotect(p_out, "datetime")
print("Unprotected date: "+str(unprotected_output))
Result
Input date as a Date object : 2019-12-02
Protected date: 2936-03-31
Unprotected date: 2019-12-02
Example 2: Input date object in YYYY-MM-DD format
In this example, the 2019/02/12 date string is used as the data, which is first converted to a date object using the Python date method of the datetime module.
The date object is then tokenized using the datetime data element, and then detokenized using the same data element.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
print("\nInput date as a Date object : "+str(data))
p_out = session.protect(data, "datetime")
print("Protected date: "+str(p_out))
unprotected_output = session.unprotect(p_out, "datetime")
print("Unprotected date: "+str(unprotected_output))
Result
Input date as a Date object : 2019-02-12
Protected date: 1154-10-29
Unprotected date: 2019-02-12
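In Python, the format string is used only when the input string is parsed; the resulting date object is rendered by str() as YYYY-MM-DD, as seen in the results above. The following plain Python sketch illustrates this without calling the API. The second input string, in MM.DD.YYYY layout, is an illustrative value that does not appear in the examples above.

```python
from datetime import date, datetime

# The same calendar date parsed from two differently formatted strings.
d1 = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
d2 = datetime.strptime("02.12.2019", "%m.%d.%Y").date()

print(d1 == d2)                 # True
print(str(d1))                  # 2019-02-12
print(d1 == date(2019, 2, 12))  # True
```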
Example - Detokenizing Bulk Date Objects
The example for using the unprotect API for retrieving the original bulk date objects from the token data is described in this section.
If a date object is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date object in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
Example: Input as a Date Object
In this example, the 2019/02/12 and 2018/01/11 date strings are used as the data, which are first converted to date objects using the Python date method of the datetime module. The two date objects are then used to create a list, which is used as the input data.
The input list is then tokenized using the datetime data element, and then detokenized using the same data element.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data1 = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
data2 = datetime.strptime("2018/01/11", "%Y/%m/%d").date()
data = [data1, data2]
print("Input data: "+str(data))
p_out = session.protect(data, "datetime")
print("Protected data: "+str(p_out))
unprotected_output = session.unprotect(p_out[0], "datetime")
print("Unprotected date: "+str(unprotected_output))
Result
Input data: [datetime.date(2019, 2, 12), datetime.date(2018, 1, 11)]
Protected data: ([datetime.date(1154, 10, 29), datetime.date(1543, 1, 5)], (6, 6))
Unprotected date: ([datetime.date(2019, 2, 12), datetime.date(2018, 1, 11)], (8, 8))
- 6 is the success return code for the protect operation of each element in the list.
- 8 is the success return code for the unprotect operation of each element in the list.
reprotect
The reprotect API reprotects data using a tokenization, data type preserving encryption, No Encryption, or encryption data element. The protected data is first unprotected and then protected again with a new data element. It supports bulk protection without a maximum data limit. However, it is recommended that you do not pass more than 1 MB of input data in each protection call.
For String and Byte data types, the maximum length for tokenization is 4096 bytes, while no maximum length is defined for encryption.
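The two limits above can be checked on the client side before a bulk call is made. The sketch below is an illustration only: the helper names are hypothetical, 1 MB is assumed to mean 1024 * 1024 bytes, and the size of a string is assumed to be its UTF-8 encoded length.

```python
MAX_TOKEN_BYTES = 4096        # tokenization limit for String and Byte data, from the text above
MAX_CALL_BYTES = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def byte_len(item):
    """Size of a str (UTF-8 encoded, an assumption) or bytes item."""
    return len(item.encode("utf-8")) if isinstance(item, str) else len(item)


def batches(items, max_call_bytes=MAX_CALL_BYTES, max_item_bytes=MAX_TOKEN_BYTES):
    """Yield lists of items whose total size stays within max_call_bytes."""
    batch, size = [], 0
    for item in items:
        n = byte_len(item)
        if n > max_item_bytes:
            raise ValueError(f"item of {n} bytes exceeds {max_item_bytes} bytes")
        # Start a new batch when adding this item would exceed the per-call size.
        if batch and size + n > max_call_bytes:
            yield batch
            batch, size = [], 0
        batch.append(item)
        size += n
    if batch:
        yield batch


print([len(b) for b in batches(["a" * 4096] * 300)])  # [256, 44]
```

Each yielded batch can then be passed as the data argument of a separate bulk call. For encryption data elements, where no maximum length is defined, max_item_bytes can be raised accordingly.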
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
def reprotect(self, data, old_de, new_de, **kwargs)
Note: Do not pass the self parameter while invoking the API.
Parameters
data: Protected data to be reprotected. The data is first unprotected with the old data element and then protected with the new data element.
old_de: String containing the data element name defined in the policy for the input data. This data element is used to unprotect the protected data as part of the reprotect operation.
new_de: String containing the data element name defined in the policy to create the output data. This data element is used to protect the data as part of the reprotect operation.
**kwargs: Specify one or more of the following keyword arguments:
- old_external_iv: Specify the old external IV in bytes for Tokenization. This old external IV is used to unprotect the protected data as part of the reprotect operation. This argument is optional.
- new_external_iv: Specify the new external IV in bytes for Tokenization. This new external IV is used to protect the data as part of the reprotect operation. This argument is optional.
- encrypt_to: Specify this argument for re-encrypting the bytes data and set its value to bytes. This argument is Mandatory. This argument must not be used for Tokenization.
- charset: This is an optional argument. It indicates the byte order of the input buffer. You can specify a value for this argument from the charset constants, such as, UTF8, UTF16LE, or UTF16BE. The default value for the charset argument is UTF8.
The charset argument is only applicable for the input data of byte type.
The charset parameter is mandatory for the data elements created with Unicode Gen2 tokenization method for byte APIs. The encoding set for the charset parameter must match the encoding of the input data passed.
Keyword arguments are case sensitive.
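Because keyword arguments are case sensitive, a misspelled name such as Old_External_IV is easy to miss. The following sketch validates the names against the keyword arguments listed above and checks that the IVs are bytes before the call is made. The helper name check_reprotect_kwargs is hypothetical and is not part of the AP Python API.

```python
# Keyword argument names taken from the list above.
REPROTECT_KWARGS = {"old_external_iv", "new_external_iv", "encrypt_to", "charset"}


def check_reprotect_kwargs(**kwargs):
    """Validate keyword arguments intended for reprotect and return them as a dict."""
    unknown = set(kwargs) - REPROTECT_KWARGS
    if unknown:
        raise TypeError(f"unknown keyword argument(s): {sorted(unknown)}")
    for name in ("old_external_iv", "new_external_iv"):
        if name in kwargs and not isinstance(kwargs[name], bytes):
            raise TypeError(f"{name} must be bytes")
    return kwargs


kwargs = check_reprotect_kwargs(
    old_external_iv=bytes("1234", encoding="utf-8"),
    new_external_iv=bytes("123456", encoding="utf-8"),
)
print(sorted(kwargs))  # ['new_external_iv', 'old_external_iv']
```

The validated dictionary can then be unpacked into the call, for example session.reprotect(data, old_de, new_de, **kwargs).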
Returns
- For single data: Returns the reprotected data
- For bulk data: Returns a tuple of the following data:
- List or tuple of the reprotected data
- Tuple of error codes
Exceptions
InvalidSessionError: This exception is thrown if the session is invalid or has timed out.
ProtectError: This exception is thrown if the API is unable to protect the data.
If the reprotect API is used with bulk data, then it does not throw any exception. Instead, it only returns an error code.
For more information about the return codes, refer to Application Protector API Return Codes.
Example - Retokenizing String Data
The examples for using the reprotect API for retokenizing string data are described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example 1: Input string data
In the following example, the Protegrity1 string is used as the input data, which is first tokenized using the string data element.
The tokenized input data, the old data element string, and a new data element address are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("Protegrity1", "string")
print("Protected Data: %s" %output)
r_out = session.reprotect(output, "string", "address")
print("Reprotected Data: %s" %r_out)
Result
Protected Data: 4l0z9SQrhtk
Reprotected Data: hFReRmrqzzB
Example 2: Input date passed as a string
In the following example, the 2019/02/14 string is used as the input data, which is first tokenized using the datetime data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("2019/02/14", "datetime")
print("Protected data: "+str(output))
r_out = session.reprotect(output, "datetime", "datetime_yc")
print("Reprotected data: "+str(r_out))
Result
Protected data: 1072/07/29
Reprotected data: 2019/07/13
Example 3: Input date and time passed as a string
In the following example, the 2019/02/14 10:54:47 string is used as the input data, which is first tokenized using the datetime data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if the input date and time string in YYYY/MM/DD HH:MM:SS MMM format is provided, then only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element must be used to protect the data.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect("2019/02/14 10:54:47", "datetime")
print("Protected data: "+str(output))
r_out = session.reprotect(output, "datetime", "datetime_yc")
print("Reprotected data: "+str(r_out))
Result
Protected data: 1072/07/29 10:54:47
Reprotected data: 2019/07/13 10:54:47
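Because the input string must match the format of the data element, it can be useful to validate the string before calling the API. The sketch below uses datetime.strptime from the Python standard library for this check. The helper name matches_format is hypothetical, and the mapping of the YYYY/MM/DD HH:MM:SS layout to the %Y/%m/%d %H:%M:%S format string is an assumption made for illustration.

```python
from datetime import datetime


def matches_format(text, fmt):
    """Return True if text can be parsed with the given strptime format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(matches_format("2019/02/14 10:54:47", "%Y/%m/%d %H:%M:%S"))  # True
print(matches_format("2019-02-14 10:54:47", "%Y/%m/%d %H:%M:%S"))  # False
print(matches_format("2019/02/14", "%Y/%m/%d"))                    # True
```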
Example 4: Retokenizing Unicode Data as String
In the following example, the ‘protegrity1234ÀÁÂÃÄÅÆÇÈÉ’ unicode data is used as the input data, which is first tokenized using the string data element.
The tokenized input data, the old data element string, and a new data element address are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect('protegrity1234ÀÁÂÃÄÅÆÇÈÉ', "string")
print("Protected Data: %s" %output)
r_out = session.reprotect(output, "string", "address")
print("Reprotected Data: %s" %r_out)
Result
Protected Data: VSYaLoLxo8GMyqÀÁÂÃÄÅÆÇÈÉ
Reprotected Data: sOcSzhEwXTrclwÀÁÂÃÄÅÆÇÈÉ
Example - Retokenizing String Data with External IV
The example for using the reprotect API for retokenizing string data using external IV is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
If you want to pass the external IV as a keyword argument to the reprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, the Protegrity1 string is used as the input data, which is first tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the string data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV. It then retokenizes the data using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
p_out = session.protect("Protegrity1", "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %p_out)
r_out = session.reprotect(p_out, "string",
"string", old_external_iv=bytes("1234", encoding="utf-8"),
new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: %s" %r_out)
Result
Protected Data: oEquECC2JYb
Reprotected Data: m6AROToSQ71
Example - Retokenizing Bulk String Data
The examples for using the reprotect API for retokenizing bulk string data are described in this section. The bulk string data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
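The requirement that all elements share one data type can be checked before the bulk call. The following is a minimal sketch with a hypothetical helper name, common_type. Rejecting an empty input is an assumption of this sketch and is not a documented rule of the API.

```python
def common_type(items):
    """Return the single type shared by all items, or raise TypeError."""
    # Assumption: bulk data is a non-empty list or tuple.
    if not isinstance(items, (list, tuple)) or not items:
        raise TypeError("bulk data must be a non-empty list or tuple")
    types = {type(item) for item in items}
    if len(types) != 1:
        names = sorted(t.__name__ for t in types)
        raise TypeError(f"mixed element types: {names}")
    return types.pop()


print(common_type(["protegrity1234", "Protegrity1", "Protegrity56"]))  # <class 'str'>
```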
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example 1: Input bulk string data
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is tokenized using the string data element.
The tokenized input data, the old data element string, and a new data element address are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "string", "address")
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
(['VSYaLoLxo8GMyq', '4l0z9SQrhtk', '9xP5wBuXJuce'], (6, 6, 6))
Reprotected Data:
(['sOcSzhEwXTrclw', 'hFReRmrqzzB', 'imoJL6U4mWPk'], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example 2: Input date passed as bulk strings
In the following example, the 2019/02/14 and 2018/03/11 strings are stored in a list and used as bulk data, which is tokenized using the datetime data element.
If a date string is provided as input, then the data element with the same tokenization type as the input date format must be used to protect the data. For example, if you have provided the input date string in YYYY/MM/DD format, then you must use only the Date (YYYY/MM/DD) data element to protect the data.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14", "2018/03/11"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
r_out = session.reprotect(output[0], "datetime", "datetime_yc")
print("Reprotected data: "+str(r_out))
Result
Protected data: (['1072/07/29', '0907/12/30'], (6, 6))
Reprotected data: (['2019/07/13', '2018/12/14'], (50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example 3: Input date and time passed as bulk strings
In the following example, the 2019/02/14 10:54:47 and 2019/11/03 11:01:32 strings are stored in a list and used as bulk data, which is tokenized using the datetime data element.
If a date and time string is provided as input, then the data element with the same tokenization type as the input format must be used for data protection. For example, if you have provided the input date and time string in YYYY-MM-DD HH:MM:SS MMM format, then you must use only the Datetime (YYYY-MM-DD HH:MM:SS MMM) data element to protect the data.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["2019/02/14 10:54:47", "2019/11/03 11:01:32"]
output = session.protect(data, "datetime")
print("Protected data: "+str(output))
r_out = session.reprotect(output[0], "datetime", "datetime_yc")
print("Reprotected data: "+str(r_out))
Result
Protected data: (['1072/07/29 10:54:47', '2249/12/17 11:01:32'], (6, 6))
Reprotected data: (['2019/07/13 10:54:47', '2019/05/29 11:01:32'], (50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Retokenizing Bulk String Data with External IV
The example for using the reprotect API for retokenizing bulk string data using external IV is described in this section. The bulk string data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
If you want to pass the external IV as a keyword argument to the reprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are stored in a list and used as bulk data, which is tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the string data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV. It then retokenizes the data using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = ["protegrity1234", "Protegrity1", "Protegrity56"]
p_out = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "string","string",
old_external_iv=bytes("1234", encoding="utf-8"),
new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
(['aCzyqwijkSDqiG', 'oEquECC2JYb', 't0Ly7KYx7Wyo'], (6, 6, 6))
Reprotected Data:
(['EqDxRW2QhMqZJV', 'm6AROToSQ71', 'DTWuFfYK2ZpL'], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Retokenizing Integer Data
The example for using the reprotect API for retokenizing integer data is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Integer data element to protect the data, then you must use only the Integer data element to reprotect the data.
Example
In the following example, 21 is used as the input integer data, which is first tokenized using the int data element.
The tokenized input data, the old data element int, and a new data element int are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
output = session.protect(21, "int")
print("Protected Data: %s" %output)
r_out = session.reprotect(output, "int", "int")
print("Reprotected Data: %s" %r_out)
Result
Protected Data: -94623223
Reprotected Data: -94623223
Example - Retokenizing Integer Data with External IV
The example for using the reprotect API for retokenizing integer data using external IV is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Integer data element to protect the data, then you must use only the Integer data element to reprotect the data.
If you want to pass the external IV as a keyword argument to the reprotect API, then you must pass the external IV as bytes to the API.
The AP Python APIs support integer values only between -2147483648 and 2147483648, both inclusive.
Example
In the following example, 21 is used as the input integer data, which is first tokenized using the int data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the int data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV. It then retokenizes the data using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
p_out = session.protect(21, "int",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %p_out)
r_out = session.reprotect(p_out, "int", "int",
old_external_iv=bytes("1234", encoding="utf-8"), new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: %s" %r_out)
Result
Protected Data: 1983567415
Reprotected Data: 16592685
Example - Retokenizing Bulk Integer Data
The example for using the reprotect API for retokenizing bulk integer data is described in this section. The bulk integer data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Integer data element to protect the data, then you must use only the Integer data element to reprotect the data.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is tokenized using the int data element.
The tokenized input data, the old data element int, and a new data element int are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int")
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "int", "int")
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
([-94623223, -572010955, 2021989009], (6, 6, 6))
Reprotected Data:
([-94623223, -572010955, 2021989009], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Retokenizing Bulk Integer Data with External IV
The example for using the reprotect API for retokenizing bulk integer data using external IV is described in this section. The bulk integer data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Integer data element to protect the data, then you must use only the Integer data element to reprotect the data.
If you want to pass the external IV as a keyword argument to the reprotect API, then you must pass the external IV as bytes to the API.
Example
In the following example, 21, 42, and 55 integers are stored in a list and used as bulk data, which is tokenized using the int data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the int data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV. It then retokenizes the data using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [21, 42, 55]
p_out = session.protect(data, "int", external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "int", "int",
old_external_iv=bytes("1234", encoding="utf-8"), new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
([1983567415, -1471024670, 1465229692], (6, 6, 6))
Reprotected Data:
([16592685, -2026434677, 262981938], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Retokenizing Bytes Data
The example for using the reprotect API for retokenizing bytes data is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example
In the following example, Protegrity1 string is first converted to bytes using the Python bytes() method. The bytes data is then tokenized using the string data element.
The tokenized input data, the old data element string, and a new data element address are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "string")
print("Protected Data: %s" %p_out)
r_out = session.reprotect(p_out, "string", "address")
print("Reprotected Data: %s" %r_out)
Result
Protected Data: b'4l0z9SQrhtk'
Reprotected Data: b'hFReRmrqzzB'
In the following example, the Protegrity1 string is first converted to bytes using the Python bytes() method with the utf-16be encoding. The bytes data is then tokenized using the string data element. The encrypt_to parameter, set to bytes, and the charset parameter, set to Charset.UTF16BE, are passed as keyword arguments.
The tokenized input data, the old data element string, and a new data element string are then passed as inputs to the reprotect API, together with the same encrypt_to and charset keyword arguments. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
from appython import Charset
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-16be")
p_out = session.protect(data, "string", encrypt_to=bytes, charset=Charset.UTF16BE)
print("Protected Data: %s" %p_out)
r_out = session.reprotect(p_out, "string", "string", encrypt_to=bytes, charset=Charset.UTF16BE)
print("Reprotected Data: %s" %r_out)
Result
Protected Data: b'\x004\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k'
Reprotected Data: b'\x004\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k'
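To read the UTF-16BE result above, it can help to decode it back to text. The following sketch uses only built-in Python and the output values shown in this section. It does not call the protector, and the variable names are illustrative.

```python
# Token returned by the UTF-8 example in this section.
utf8_token = b"4l0z9SQrhtk"
# Token returned by the UTF-16BE example in this section.
utf16be_token = b"\x004\x00l\x000\x00z\x009\x00S\x00Q\x00r\x00h\x00t\x00k"

# UTF-16BE stores each of these characters as two bytes, high byte first,
# which is why every character is preceded by \x00 in the printed result.
decoded_utf8 = utf8_token.decode("utf-8")
decoded_utf16be = utf16be_token.decode("utf-16be")

print(decoded_utf8)
print(decoded_utf16be)
print(decoded_utf8 == decoded_utf16be)
```

Both results decode to the same token text, so the two examples differ only in how that text is encoded as bytes.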
Example - Retokenizing Bytes Data with External IV
The example for using the reprotect API for retokenizing bytes data using external IV is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example
In the following example, Protegrity1 string is first converted to bytes using the Python bytes() method. The bytes data is then tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the string data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV, and then retokenizes it using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: %s" %p_out)
r_out = session.reprotect(p_out, "string",
"string", old_external_iv=bytes("1234", encoding="utf-8"),
new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: %s" %r_out)
Result
Protected Data: b'oEquECC2JYb'
Reprotected Data: b'm6AROToSQ71'
Example - Re-Encrypting Bytes Data
The example for using the reprotect API for re-encrypting bytes data is described in this section.
If you are using the reprotect API, then the old data element and the new data element must be of the same protection method. For example, if you have used the text data element to protect the data, then you must use only the text data element to reprotect the data.
Example
In the following example, Protegrity1 string is first converted to bytes using the Python bytes() method. The bytes data is then encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument, and its value is set to bytes.
The encrypted input data, the old data element text, and a new data element text are then passed as inputs to the reprotect API. The reprotect API first decrypts the protected input data using the old data element and then re-encrypts it using the new data element, as part of a single reprotect operation. Therefore, the encrypt_to parameter is passed as a keyword argument, and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data=bytes("Protegrity1", encoding="utf-8")
p_out = session.protect(data, "text", encrypt_to = bytes)
print("Encrypted Data: %s" %p_out)
r_out = session.reprotect(p_out, "text", "text", encrypt_to = bytes)
print("Re-encrypted Data: %s" %r_out)
Result
Encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
Re-encrypted Data: b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'
Example - Retokenizing Bulk Bytes Data
The example for using the reprotect API for retokenizing bulk bytes data is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is tokenized using the string data element.
The tokenized input data, the old data element string, and a new data element address are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234","utf-8"), bytes("Protegrity1","utf-8"), bytes("Protegrity56","utf-8")]
p_out = session.protect(data, "string")
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "string", "address")
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
([b'VSYaLoLxo8GMyq', b'4l0z9SQrhtk', b'9xP5wBuXJuce'], (6, 6, 6))
Reprotected Data:
([b'sOcSzhEwXTrclw', b'hFReRmrqzzB', b'imoJL6U4mWPk'], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Retokenizing Bulk Bytes Data with External IV
The example for using the reprotect API for retokenizing bulk bytes data using external IV is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Alpha-Numeric data element to protect the data, then you must use only the Alpha-Numeric data element to reprotect the data.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is tokenized using the string data element, with the help of external IV 1234 that is passed as bytes.
The tokenized input data, the string data element, the old external IV 1234 in bytes, and a new external IV 123456 in bytes are then passed as inputs to the reprotect API. As part of a single reprotect operation, the reprotect API first detokenizes the protected input data using the given data element and old external IV. It then retokenizes the data using the same data element, but with the new external IV.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="utf-8"), bytes("Protegrity1",
encoding="utf-8"), bytes("Protegrity56", encoding="utf-8")]
p_out = session.protect(data, "string",
external_iv=bytes("1234", encoding="utf-8"))
print("Protected Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "string",
"string", old_external_iv=bytes("1234", encoding="utf-8"),
new_external_iv=bytes("123456", encoding="utf-8"))
print("Reprotected Data: ")
print(r_out)
Result
Protected Data:
([b'aCzyqwijkSDqiG', b'oEquECC2JYb', b't0Ly7KYx7Wyo'], (6, 6, 6))
Reprotected Data:
([b'EqDxRW2QhMqZJV', b'm6AROToSQ71', b'DTWuFfYK2ZpL'], (50, 50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Example - Re-Encrypting Bulk Bytes Data
The example for using the reprotect API for re-encrypting bulk bytes data is described in this section. The bulk bytes data can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are using the reprotect API, then the old data element and the new data element must be of the same protection method. For example, if you have used the text data element to protect the data, then you must use only the text data element to reprotect the data.
To avoid data corruption, do not convert the encrypted bytes data into string format. It is recommended that you convert the encrypted bytes data to Hexadecimal, Base 64, or any other appropriate format.
Example
In the following example, protegrity1234, Protegrity1, and Protegrity56 strings are first converted to bytes using the Python bytes() method. The converted bytes are then stored in a list and used as bulk data, which is encrypted using the text data element. Therefore, the encrypt_to parameter is passed as a keyword argument, and its value is set to bytes.
The encrypted input data, the old data element text, and a new data element text are then passed as inputs to the reprotect API. The reprotect API first decrypts the protected input data using the old data element and then re-encrypts it using the new data element, as part of a single reprotect operation. Therefore, the encrypt_to parameter is passed as a keyword argument, and its value is set to bytes.
from appython import Protector
protector = Protector()
session = protector.create_session("superuser")
data = [bytes("protegrity1234", encoding="UTF-8"),
        bytes("Protegrity1", encoding="UTF-8"),
        bytes("Protegrity56", encoding="UTF-8")]
p_out = session.protect(data, "text", encrypt_to = bytes)
print("Encrypted Data: ")
print(p_out)
r_out = session.reprotect(p_out[0], "text", "text", encrypt_to = bytes)
print("Re-encrypted Data: ")
print(r_out)
Result
Encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (6, 6, 6))
Re-encrypted Data:
([b"I\xc1\xf0S\x0f\xaf\t\x06\xb5;\xb5'%\xab\x9b\x18", b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V', b'\xfd\x99\xa7\xd1V(\x02K\xc9\xbdZ\x97\xd6\xea\xcc\x13'], (50, 50, 50))
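The recommendation in this section to store encrypted bytes as Hexadecimal or Base 64 rather than as a string can be illustrated with the standard library. This sketch takes one encrypted value from the result above and round-trips it through both text formats. The variable names are illustrative.

```python
import base64

# One encrypted value copied from the result above.
encrypted = b'\x84\x84\xaf\x10fwh\xd7w\x06)`"p\xe0V'

# Hexadecimal and Base64 are lossless text representations of arbitrary bytes.
as_hex = encrypted.hex()
as_b64 = base64.b64encode(encrypted).decode("ascii")

# Both representations convert back to the exact original bytes.
assert bytes.fromhex(as_hex) == encrypted
assert base64.b64decode(as_b64) == encrypted

# Decoding the ciphertext as UTF-8 text fails because it is not valid UTF-8.
try:
    encrypted.decode("utf-8")
    decodes_as_text = True
except UnicodeDecodeError:
    decodes_as_text = False

print(as_hex)
print(as_b64)
print(decodes_as_text)
```

Either text form can be stored or transmitted safely and converted back to bytes before it is passed to the reprotect or unprotect API.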
Example - Retokenizing Date Objects
The example for using the reprotect API for retokenizing date objects is described in this section.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Date (YYYY/MM/DD) data element to protect the data, then you must use only the Date (YYYY/MM/DD) data element to reprotect the data.
Example: Input as a Date Object
In the following example, the 2019/02/12 date string is used as the data, which is first converted to a date object using the Python date method of the datetime module. The date object is then tokenized using the datetime data element.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
print("Input date as a Date object : "+str(data))
p_out = session.protect(data, "datetime")
print("Protected date: "+str(p_out))
r_out = session.reprotect(p_out, "datetime", "datetime_yc")
print("Reprotected date: "+str(r_out))
Result
Input date as a Date object : 2019-02-12
Protected date: 1154-10-29
Reprotected date: 2019-02-03
Example - Retokenizing Bulk Date Objects
The example for using the reprotect API for retokenizing bulk date objects is described in this section. The bulk date objects can be passed as a list or a tuple.
The individual elements of the list or tuple must be of the same data type.
If you are retokenizing the data using the reprotect API, then the old data element and the new data element must have the same tokenization type. For example, if you have used the Date (YYYY/MM/DD) data element to protect the data, then you must use only the Date (YYYY/MM/DD) data element to reprotect the data.
Example: Input as a Date Object
In the following example, the 2019/02/12 and 2018/01/11 date strings are used as the data, which are first converted to date objects using the Python date method of the datetime module. The two date objects are then used to create a list, which is used as the input data.
The input list is then tokenized using the datetime data element.
The tokenized input data, the old data element datetime, and a new data element datetime_yc are then passed as inputs to the reprotect API. The reprotect API detokenizes the protected input data using the old data element and then retokenizes it using the new data element, as part of a single reprotect operation.
from appython import Protector
from datetime import datetime
protector = Protector()
session = protector.create_session("superuser")
data1 = datetime.strptime("2019/02/12", "%Y/%m/%d").date()
data2 = datetime.strptime("2018/01/11", "%Y/%m/%d").date()
data = [data1, data2]
print("Input data: ", str(data))
p_out = session.protect(data, "datetime")
print("Protected data: "+str(p_out))
r_out = session.reprotect(p_out[0], "datetime", "datetime_yc")
print("Reprotected date: "+str(r_out))
Result
Input data: [datetime.date(2019, 2, 12), datetime.date(2018, 1, 11)]
Protected data: ([datetime.date(1154, 10, 29), datetime.date(1543, 1, 5)], (6, 6))
Reprotected date: ([datetime.date(2019, 2, 3), datetime.date(2018, 11, 14)], (50, 50))
- 6 is the success return code for the protect operation of each element in the list.
- 50 is the success return code for the reprotect operation of each element in the list.
Log return codes for Protectors
The log return codes and their descriptions help you understand why a code was returned and are useful during troubleshooting.
| Return Code | Description |
|---|---|
| 0 | Error code for no logging |
| 1 | The username could not be found in the policy |
| 2 | The data element could not be found in the policy |
| 3 | The user does not have the appropriate permissions to perform the requested operation |
| 5 | Integrity check failed |
| 6 | Data protect operation was successful |
| 7 | Data protect operation failed |
| 8 | Data unprotect operation was successful |
| 9 | Data unprotect operation failed |
| 10 | The user has appropriate permissions to perform the requested operation but no data has been protected/unprotected |
| 11 | Data unprotect operation was successful with use of an inactive keyid |
| 12 | Input is null or not within allowed limits |
| 13 | Internal error occurring in a function call after the provider has been opened |
| 14 | Failed to load data encryption key |
| 20 | Failed to allocate memory |
| 21 | Input or output buffer is too small |
| 22 | Data is too short to be protected/unprotected |
| 23 | Data is too long to be protected/unprotected |
| 26 | Unsupported algorithm or unsupported action for the specific data element |
| 27 | Application has been authorized |
| 28 | Application has not been authorized |
| 31 | Policy not available |
| 44 | The content of the input data is not valid |
| 49 | Unsupported input encoding for the specific data element |
| 50 | Data reprotect operation was successful |
| 51 | Failed to send logs, connection refused |
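The bulk examples in this document return a tuple that contains the list of output values and a tuple of return codes. The following sketch shows one way to check such a result against the table above. The function name `find_failures` and the mixed result are made up for illustration. Whether a real bulk call returns a per-item failure code or raises an exception is not confirmed here.

```python
# Subset of the return codes listed in the table above.
RETURN_CODES = {
    1: "The username could not be found in the policy",
    2: "The data element could not be found in the policy",
    3: "The user does not have the appropriate permissions to perform the requested operation",
    6: "Data protect operation was successful",
    8: "Data unprotect operation was successful",
    50: "Data reprotect operation was successful",
}


def find_failures(bulk_result, success_code):
    """Return (index, code, description) for every item whose code differs from success_code."""
    values, codes = bulk_result
    failures = []
    for index, code in enumerate(codes):
        if code != success_code:
            description = RETURN_CODES.get(code, "See the return code table")
            failures.append((index, code, description))
    return failures


# Bulk reprotect result in the shape shown in the examples.
r_out = ([-94623223, -572010955, 2021989009], (50, 50, 50))
print(find_failures(r_out, success_code=50))

# Made-up result in which the second item carries code 3 (assumption for illustration).
mixed = (["token1", None, "token3"], (50, 3, 50))
print(find_failures(mixed, success_code=50))
```

Pass 6 as the success code for results of the protect operation and 50 for results of the reprotect operation.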
5 - Customizing the sample application
The steps mentioned in this section are optional. The sample application can detect and redact data with the default configurations. Change these configurations only when you need to change the way that the files are processed, for example, to change the name of the input or output file.
Sample application customization
Specifying the source file
The source file contains the data that must be processed. This file can have a paragraph of text or a table with values. Protegrity AI Developer Edition can process various files. However, for security reasons, certain characters are rejected and are not processed. To enable or disable these security settings, refer to the section Input Sanitization. This version of the release only supports files containing plain text.
To specify the source file:
Navigate to the location where Protegrity AI Developer Edition is cloned.
Open the sample-app-find-and-redact.py file from the samples directory.
Locate the following statement.
input_file = base_dir / "sample-data" / "input.txt"
Update the path and name for the source file.
Save and close the file.
Run the Python file.
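The input_file statement uses the `/` operator of the Python pathlib module to join path segments. The following sketch shows how such a path is composed. The value of `base_dir` and the file name `customers.txt` are assumptions for illustration, because the sample application defines its own `base_dir`.

```python
from pathlib import Path

# Hypothetical base directory; the sample application defines its own base_dir.
base_dir = Path("clone-directory") / "samples"

# Default location, composed in the same way as the statement in the sample application.
input_file = base_dir / "sample-data" / "input.txt"

# Pointing the sample application at a different, hypothetical source file.
custom_input_file = base_dir / "sample-data" / "customers.txt"

print(input_file.name)
print(custom_input_file.parent == input_file.parent)
```

Because each segment is a separate string, only the last segment needs to change when the file stays in the same directory.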
Specifying the output file
The output file location specifies where the processed output file must be stored.
To specify the output file:
Navigate to the location where Protegrity AI Developer Edition is cloned.
Open the sample-app-find-and-redact.py file from the samples directory.
Locate the following statement.
output_file = base_dir / "sample-data" / "output-redact.txt"
Update the path and name for the output file.
Save and close the file.
Run the Python file.
Specifying the configuration settings
Use the config.json configuration file to specify the data that must be redacted or masked. The character that must be used for masking can also be specified.
Before you begin:
Identify the sensitive fields that are present in the source file.
Open a command prompt.
Navigate to the directory where the sample application is extracted.
Run the following command.
python samples/sample-app-find.py
View the list of sensitive items. For a complete list of items that can be identified, refer to the List of items.
Updating the configuration file.
Navigate to the location where Protegrity AI Developer Edition is cloned.
Open the config.json file.
Specify the masking character to use in the following code.
"masking_char": "#"
Specify the text to use for the redacted data in the named_entity_map parameter. The following code shows the value used for the sample source file.
"named_entity_map": {
  "USERNAME": "USERNAME",
  "STATE": "STATE",
  "PHONE_NUMBER": "PHONE",
  "SOCIAL_SECURITY_NUMBER": "SSN",
  "AGE": "AGE",
  "CITY": "CITY",
  "PERSON": "PERSON"
}
Specify the operation to perform on the source file. The available options are mask and redact.
"method": "mask"
Save and close the file.
Run the Python file.
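The same settings can also be changed from a script. The following sketch assumes that config.json is a flat JSON object containing the keys shown above and that the masking character is a single character. The function name `update_config` is hypothetical, and the demonstration works on a temporary file rather than the real configuration file.

```python
import json
import tempfile
from pathlib import Path

ALLOWED_METHODS = {"mask", "redact"}  # the two options named in this section


def update_config(path, masking_char=None, method=None, named_entity_map=None):
    """Load a JSON config file, change the given settings, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if masking_char is not None:
        if len(masking_char) != 1:
            raise ValueError("masking_char must be a single character")
        config["masking_char"] = masking_char
    if method is not None:
        if method not in ALLOWED_METHODS:
            raise ValueError("method must be 'mask' or 'redact'")
        config["method"] = method
    if named_entity_map is not None:
        config["named_entity_map"] = dict(named_entity_map)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demonstration on a temporary file with made-up starting content.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"masking_char": "#", "method": "mask"}), encoding="utf-8")
    updated = update_config(
        demo_path,
        masking_char="*",
        method="redact",
        named_entity_map={"PERSON": "PERSON", "PHONE_NUMBER": "PHONE"},
    )
    reloaded = json.loads(demo_path.read_text(encoding="utf-8"))
    print(reloaded)
```

Validating the method before writing keeps an unsupported value out of the file.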
Specifying the classification score threshold settings
The classification score threshold sets the minimum confidence level needed for the system to treat detected data as valid. It helps filter out uncertain matches so that only high-confidence results are flagged. Adjust this threshold during setup. The threshold is a number between 0 and 1.0, for example, 0.6 for 60%. Lowering it makes the system more sensitive, while raising it reduces false positives.
To set the value:
Navigate to the location where Protegrity AI Developer Edition is cloned.
Open the sample-app-find-and-redact.py file from the samples directory.
Locate the following statement.
"classification_score_threshold", 0.6
Set the required value.
Save and close the file.
Run the Python file.
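The effect of the threshold can be sketched with a simple filter. The detection records and their field names below are made up for illustration and do not represent the response format of the classification service. The sketch also assumes that a score equal to the threshold is accepted.

```python
def filter_by_threshold(detections, threshold=0.6):
    """Keep detections whose score is at least the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1.0")
    return [d for d in detections if d["score"] >= threshold]


# Made-up detections; the field names are assumptions.
detections = [
    {"entity": "PERSON", "text": "John Doe", "score": 0.92},
    {"entity": "CITY", "text": "Paris", "score": 0.61},
    {"entity": "AGE", "text": "42", "score": 0.35},
]

# With the default threshold, the low-confidence AGE match is dropped.
print([d["entity"] for d in filter_by_threshold(detections, 0.6)])
# A lower threshold is more sensitive and keeps all three matches.
print([d["entity"] for d in filter_by_threshold(detections, 0.3)])
```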
Specifying the logging parameters
The log messages are sent to the terminal. To capture the log data, redirect the output of the commands to a log file.
To set the logging level:
Navigate to the location where Protegrity AI Developer Edition is cloned.
Open the config.json file.
Locate or add the following statement.
"enable_logging": True, "log_level": "info",
Ensure that logging is set to True and set the required log level that must be displayed.
Save and close the file.
Run the Python file.
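Because the log messages go to the terminal, one way to keep them is to run the sample from a small wrapper that writes the output to a file. The following sketch uses the standard subprocess module. The function name `run_and_log` and the log file name are hypothetical, and a stand-in command is used so that the sketch runs without the sample application.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(command, log_path):
    """Run a command and write its terminal output (stdout and stderr) to log_path."""
    with open(log_path, "w", encoding="utf-8") as log_file:
        completed = subprocess.run(command, stdout=log_file, stderr=subprocess.STDOUT)
    return completed.returncode


# Stand-in command; replace it with, for example,
# [sys.executable, "samples/sample-app-find-and-redact.py"].
demo_command = [sys.executable, "-c", "print('sample log line')"]

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "sample-app.log"
    exit_code = run_and_log(demo_command, log_path)
    captured = log_path.read_text(encoding="utf-8")

print(exit_code)
print(captured.strip())
```

Redirecting stderr into the same file keeps error messages and regular output in one place.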
Python module configuration
The following parameters are configurable for AI Developer Edition.
| Parameter | Description | Values | Example |
|---|---|---|---|
| endpoint_url | The Data Discovery endpoint for classifying sensitive data. | Specify a URL. | http://localhost:${CLASSIFICATION_PORT:-8580}/pty/data-discovery/v1.0/classify |
| named_entity_map | A dictionary or map of entities and their corresponding replacement names. | List of items | "named_entity_map": { "PERSON": "PERSON", "PHONE_NUMBER": "PHONE" } |
| masking_char | The character to be used for masking. | Specify a special character. | # |
| classification_score_threshold | The minimum confidence level needed for the system to treat detected data as valid. | Specify a number between 0 and 1.0 | 0.6 |
| method | The method for processing sensitive data. | redact or mask | mask |
| enable_logging | Specify whether to enable logging. | True or False | True |
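The example value for endpoint_url uses the shell-style expansion `${CLASSIFICATION_PORT:-8580}`, which falls back to 8580 when the variable is unset or empty. The following sketch resolves the same URL in Python. The function name `classification_endpoint` is hypothetical, and the sketch only builds the URL without contacting the service.

```python
import os


def classification_endpoint(env=None):
    """Build the classify URL, using port 8580 when CLASSIFICATION_PORT is unset or empty."""
    env = os.environ if env is None else env
    # `${VAR:-default}` uses the default when VAR is unset or empty,
    # which `get(...) or default` reproduces.
    port = env.get("CLASSIFICATION_PORT") or "8580"
    return "http://localhost:%s/pty/data-discovery/v1.0/classify" % port


print(classification_endpoint({}))
print(classification_endpoint({"CLASSIFICATION_PORT": "9090"}))
```

Passing a dictionary instead of the real environment makes the behavior easy to check without changing environment variables.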
6 - Building the Python module
The protegrity-developer-python repository is part of the Protegrity AI Developer Edition suite. This repository provides the Python module for integrating Protegrity’s Data Discovery and Protection APIs into GenAI and traditional applications.
Customize, compile, and use the module as per your requirement.
💡Note: Build and use this module only if you intend to change the source and default behavior. Ensure that the Protegrity AI Developer Edition is running before installing this module. For setup instructions, refer to the installation steps.
Prerequisites
- Git
- Python >= 3.12.11
- pip
- Python Virtual Environment
- Uninstall the protegrity_developer_python module from the Python virtual environment if it is already installed.
pip uninstall protegrity_developer_python
Build the protegrity-developer-python module
- Clone the repository.
git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-python.git
- Navigate to the protegrity-developer-python directory in the cloned location.
- Optional: Update the files in the Python source directory as required.
- Activate the Python virtual environment.
- Install the dependencies.
pip install -r requirements.txt
- Build and install the Python module by running the following command from the root directory of the repository.
pip install .
The installation completes and the success message is displayed.
7 - Appendix
7.1 - Input Sanitization
The Classification service in Data Discovery offers a security feature that rejects unsanitized data. Data that is malformed or non-normalized, or that contains homoglyphs, hieroglyphs, mixed Unicode variants, or control characters, is considered unsanitized and is rejected for classification.
The following are a few examples of data that will be rejected:
- Ⅷ
- 𝓉𝑒𝓍𝓉
- Pep
Before invoking the Classification endpoint, ensure that the input text is normalized. Replace invalid characters with their corresponding normalized plaintext characters. If the input text contains any invalid character, the status code 422 and the message Untrusted input are returned.
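One possible way to normalize text before sending it is the NFKC form provided by the Python unicodedata module. The following sketch is an assumption about what a suitable pre-processing step could look like, not the service's own sanitization logic. The function name `normalize_text` is hypothetical. NFKC does not map look-alike characters from other scripts, such as Cyrillic homoglyphs, to Latin letters, so text that passes through this step may still be rejected.

```python
import unicodedata


def normalize_text(text):
    """Apply NFKC normalization and drop control characters other than newline and tab."""
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in normalized
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )


roman_eight = "\u2167"  # ROMAN NUMERAL EIGHT, the first rejected example above
styled_text = "\U0001d4c9\U0001d452\U0001d4cd\U0001d4c9"  # mathematical script and italic letters

print(normalize_text(roman_eight))
print(normalize_text(styled_text))
print(normalize_text("line\x00one"))
```

For the first two rejected examples listed above, this step yields the plain strings VIII and text.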
For security purposes, the application rejects unsanitized data by default. It is recommended that this feature remains enabled. However, to override this feature, perform the following steps.
Navigate to the docker_compose directory.
Edit the docker-compose.yaml file.
Under the environment section of classification_service, append the security parameter as follows.
- SECURITY_SETTINGS={"ENABLE_ALL_SECURITY_CONTROLS":false}
Save the changes.
Run the docker compose down command to undeploy the application.
Run the docker compose up command to redeploy the application.
7.2 - Working with the Data Discovery containers
Use Data Discovery by setting up and deploying the containers.
7.2.1 - Understanding the Docker Compose File
The following variables can be configured in the docker-compose.yml file.
| Variable | Description | Mandatory |
|---|---|---|
| networks:name | Specify the name of the Docker network. | No |
| services:environment | Specify the location for the logs in the logging_config parameter. | No |
| classification_service:ports | Specify the listening port for the classification service. By default, the port is set to 8580. | No |
7.2.2 - Deploying the Application
Ensure that the prerequisites are completed before deploying the application.
Perform the following steps to deploy the Data Discovery application on Docker.
Open a command prompt.
Navigate to the AI Developer Edition package directory.
Run the command to start the containers. For example, the following command starts the Classification service container.
docker compose up -d
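After the containers start, it can be useful to confirm that the Classification service is accepting connections before sending requests. The following is a minimal sketch using only the Python standard library. The function name wait_for_port is hypothetical, and the host and port are assumptions based on the default port 8580 listed in the Docker Compose variables. It checks only that the TCP port is open, not that the service is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            # Not reachable yet; pause briefly and retry until the deadline.
            time.sleep(0.5)
    return False
```

Calling wait_for_port("localhost", 8580) would then return True once the default port is reachable, or False if nothing is listening within 30 seconds.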
7.3 - Supported Sensitive Entity Types
| Entity Name | Data Element | Description |
|---|---|---|
| ABA_ROUTING_NUMBER | number | Routing number used to identify financial institutions in the United States. |
| ACCOUNT_NAME | string | Name associated with a financial account. |
| ACCOUNT_NUMBER | number | Bank account number used to identify financial accounts. |
| AGE | number | Age information used to identify individuals. |
| AMOUNT | int | Specific amount of money, which can be linked to financial transactions. |
| AU_ABN | number | Australian Business Number used to identify businesses in Australia. |
| AU_ACN | number | Australian Company Number used to identify businesses in Australia. |
| AU_MEDICARE | number | Medicare number used to identify individuals for healthcare services in Australia. |
| AU_TFN | number | Tax File Number used to identify taxpayers in Australia. |
| BIC | number | Bank Identifier Code used to identify financial institutions. |
| BITCOIN_ADDRESS | address | Bitcoin wallet address used for digital transactions. |
| BUILDING | address | Building information used to identify specific locations. |
| CITY | city | City information used to identify geographic locations. |
| COMPANY_NAME | string | Name of a company used to identify businesses. |
| COUNTRY | string | Country information used to identify geographic locations. |
| COUNTY | string | County information used to identify geographic locations. |
| CREDIT_CARD | ccn | Credit card number used for financial transactions. |
| CREDIT_CARD_CVV | number | Card Verification Value used to secure credit card transactions. |
| CRYPTO | address | Cryptocurrency wallet address used for digital transactions. |
| CURRENCY | string | Currency information used in financial transactions. |
| CURRENCY_CODE | string | Code representing currency used in financial transactions. |
| CURRENCY_NAME | string | Name of currency used in financial transactions. |
| CURRENCY_SYMBOL | string | Symbol representing currency, sometimes linked to financial transactions. |
| DATE | datetime | Specific date that can be linked to personal activities. |
| DATE_OF_BIRTH | datetime | Date of birth used to identify individuals. |
| DATE_TIME | datetime | Specific date and time that can be linked to personal activities. |
| DRIVER_LICENSE | number | Driver’s license number used to identify individuals. |
| EMAIL_ADDRESS | | Email address used for communication and identification. |
| ES_NIE | nin | Foreigner Identification Number used to identify non-residents in Spain. |
| ES_NIF | nin | Tax Identification Number used to identify taxpayers in Spain. |
| ETHEREUM_ADDRESS | address | Ethereum wallet address used for digital transactions. |
| FI_PERSONAL_IDENTITY_CODE | nin | Personal identity code used to identify individuals in Finland. |
| GENDER | string | Gender information used to identify individuals. |
| GEO_CCORDINATE | address | Geographic coordinates used to identify specific locations. |
| IBAN_CODE | iban | International Bank Account Number used to identify bank accounts globally. |
| ID_CARD | number | Identity card number used to identify individuals. |
| IN_AADHAAR | nin | Unique identification number used to identify residents in India. |
| IN_PAN | number | Permanent Account Number used to identify taxpayers in India. |
| IN_PASSPORT | passport | Passport number used to identify individuals in India. |
| IN_VEHICLE_REGISTRATION | number | Vehicle registration number used to identify vehicles in India. |
| IN_VOTER | number | Voter ID number used to identify registered voters in India. |
| IP_ADDRESS | address | Internet Protocol address used to identify devices on a network. |
| IPV4 | address | IPv4 address used to identify devices on a network. |
| IPV6 | address | IPv6 address used to identify devices on a network. |
| IT_DRIVER_LICENSE | number | Driver’s license number used to identify individuals in Italy. |
| IT_FISCAL_CODE | nin | Fiscal code used to identify taxpayers in Italy. |
| IT_IDENTITY_CARD | number | Identity card number used to identify individuals in Italy. |
| IT_PASSPORT | passport | Passport number used to identify individuals in Italy. |
| LITECOIN_ADDRESS | address | Litecoin wallet address used for digital transactions. |
| LOCATION | address | Specific location or address that can be linked to an individual. |
| MAC | address | Media Access Control address used to identify devices on a network. |
| MEDICAL_LICENSE | number | License number used to identify medical professionals. |
| NRP | number | A person’s nationality, religious or political group. |
| ORGANIZATION | string | Name or identifier used to identify an organization. |
| PASSPORT | passport | Passport number used to identify individuals. |
| PASSWORD | string | Password used to secure access to personal accounts. |
| PERSON | string | Name or identifier used to identify an individual. |
| PHONE_NUMBER | phone | Number used to contact or identify an individual. |
| PIN | number | Personal Identification Number used to secure access to accounts. |
| PL_PESEL | nin | Personal Identification Number used to identify individuals in Poland. |
| SECONDARYADDRESS | address | Additional address information used to identify locations. |
| SG_NRIC_FIN | nin | National Registration Identity Card number used to identify residents in Singapore. |
| SG_UEN | number | Unique Entity Number used to identify businesses in Singapore. |
| SOCIAL_SECURITY_NUMBER | ssn | Social Security Number used to identify individuals. |
| STATE | string | State information used to identify geographic locations. |
| STREET | address | Street address used to identify specific locations. |
| TIME | datetime | Specific time that can be linked to personal activities. |
| TITLE | string | Title or honorific used to identify individuals. |
| UK_NHS | number | National Health Service number used to identify individuals for healthcare services in the United Kingdom. |
| URL | address | Web address that can sometimes contain personal information. |
| US_BANK_NUMBER | number | Bank account number used to identify financial accounts in the United States. |
| US_DRIVER_LICENSE | number | Driver’s license number used to identify individuals in the United States. |
| US_ITIN | number | Individual Taxpayer Identification Number used to identify taxpayers in the United States. |
| US_PASSPORT | passport | Passport number used to identify individuals in the United States. |
| US_SSN | ssn | Social Security Number used to identify individuals in the United States. |
| USERNAME | string | Username used to identify individuals in online systems. |
| ZIP_CODE | zipcode | Postal code used to identify specific geographic areas. |
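When pairing discovery results with protection, the table above can be treated as a lookup from entity name to data element. The sketch below copies a handful of rows into a Python dictionary. The names ENTITY_TO_DATA_ELEMENT and data_element_for are hypothetical and not part of the Protegrity Python module, and only a subset of the table is included.

```python
from typing import Optional

# A subset of the table above: entity name -> data element.
ENTITY_TO_DATA_ELEMENT = {
    "CREDIT_CARD": "ccn",
    "DATE_OF_BIRTH": "datetime",
    "IBAN_CODE": "iban",
    "PASSPORT": "passport",
    "PERSON": "string",
    "PHONE_NUMBER": "phone",
    "SOCIAL_SECURITY_NUMBER": "ssn",
    "US_SSN": "ssn",
    "ZIP_CODE": "zipcode",
}


def data_element_for(entity_name: str, default: Optional[str] = None) -> Optional[str]:
    """Look up the data element for an entity name (case-insensitive)."""
    return ENTITY_TO_DATA_ELEMENT.get(entity_name.upper(), default)
```

Entities that are not in the dictionary return the default value, which lets the caller decide how to handle unmapped entity types.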
7.4 - Data Security Policy
This section describes the Policy configuration used by the AI Developer Edition API Service.
The superuser has all permissions, that is, protect, unprotect, and reprotect operations. Users assigned the admin role will receive protected data when performing an unprotect operation, except in the case of the text data elements, which will return null. All other user roles will receive null as the output for any unprotect operation.
Policy Definition
Generic Data Elements
| Data Element | Method | Use Case | UTF Set | LP | PP | eIV | Admin P | Admin U | Finance P | Finance U | Marketing P | Marketing U | HR P | HR U |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| datetime | Tokenization | A date or datetime string. Formats accepted: YYYY/MM/DD HH:MM:SS and YYYY/MM/DD. Delimiters accepted: /, - (required). | N/A | N/A | N/A | No | ✓ | X | X | X | X | ✓ | X | X |
| datetime_yc | Tokenization | A date or datetime string. Formats accepted: YYYY/MM/DD HH:MM:SS and YYYY/MM/DD. Delimiters accepted: /, - (required). Leaves the year in the clear. | N/A | N/A | N/A | No | ✓ | X | X | X | X | ✓ | X | X |
| int | Tokenization | An integer string (4 bytes). | Numeric | No | No | Yes | ✓ | X | X | X | X | ✓ | X | X |
| number | Tokenization | A numeric string. May produce leading zeroes. | Numeric | Yes | No | Yes | ✓ | X | X | X | X | ✓ | X | X |
| string | Tokenization | An alphanumeric string. | Latin + Numeric | Yes | No | Yes | ✓ | X | X | X | X | ✓ | X | X |
| text | Encryption | A long string (e.g., a comment field) using any character set. Use hex or base64 encoding to utilize. | All | No | No | Yes | ✓ | X | X | X | X | ✓ | X | X |
PCI DSS Data Elements
| Data Element | Method | Use Case | UTF Set | LP | PP | eIV | Admin P | Admin U | Finance P | Finance U | Marketing P | Marketing U | HR P | HR U |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ccn | Tokenization | Credit card numbers. | Numeric | No | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
| ccn_bin | Tokenization | Credit card numbers. Leaves 8-digit BIN in the clear. | Numeric | No | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
| iban | Tokenization | IBAN numbers. Preserves the length, case, and position of the input characters but may create invalid IBAN codes. | Latin + Numeric | Yes | Yes | No | ✓ | X | X | ✓ | X | X | X | ✓ |
| iban_cc | Tokenization | IBAN numbers. Leaves letters in the clear. | Latin + Numeric | No | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
PII Data Elements
| Data Element | Method | Use Case | UTF Set | LP | PP | eIV | Admin P | Admin U | Finance P | Finance U | Marketing P | Marketing U | HR P | HR U |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| address | Tokenization | Street names | Latin + Numeric | Yes | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
| city | Tokenization | Town or city name | Latin | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| | Tokenization | Email address. Leaves the domain in the clear. | Latin + Numeric | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| nin | Tokenization | National Insurance Number. Preserves the length, case, and position of the input characters but may create invalid NIN codes. | Latin + Numeric | Yes | Yes | No | ✓ | X | X | X | X | X | X | X |
| name | Tokenization | Person's name | Latin | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| passport | Tokenization | Passport codes. Preserves the length, case, and position of the input characters but may create invalid passport numbers. | Latin + Numeric | Yes | Yes | No | ✓ | X | X | X | X | X | X | X |
| phone | Tokenization | Phone number. May produce leading zeroes. | Latin + Numeric | Yes | No | Yes | ✓ | X | X | X | X | X | X | X |
| postcode | Tokenization | Postal codes with digits and characters. Preserves the length, case, and position of the input characters but may create invalid post codes. | Latin + Numeric | Yes | Yes | No | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| ssn | Tokenization | Social Security Number (US) | Latin + Numeric | Yes | No | Yes | ✓ | X | X | X | X | X | X | X |
| zipcode | Tokenization | Zip codes with digits only. May produce leading zeroes. | Numeric | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
PII Data Elements
| Data Element | Method | Use Case | UTF Set | LP | PP | eIV | Admin P | Admin U | Finance P | Finance U | Marketing P | Marketing U | HR P | HR U |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| address_de | Tokenization | Street names (German) | Latin + German + Numeric | Yes | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
| address_fr | Tokenization | Street names (French) | Latin + French + Numeric | Yes | No | Yes | ✓ | X | X | ✓ | X | X | X | ✓ |
| city_de | Tokenization | Town or city name (German) | Latin + German | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| city_fr | Tokenization | Town or city name (French) | Latin + French | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| name_de | Tokenization | Person's name (German) | Latin + German | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
| name_fr | Tokenization | Person's name (French) | Latin + French | Yes | No | Yes | ✓ | X | X | ✓ | X | ✓ | X | ✓ |
LEGEND
- eIV: External IV
- LP: Length Preservation
- PP: Position Preservation
- P: User group can protect data
- U: User group can unprotect data
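To show how the role columns are read, the sketch below encodes three rows of the tables above (ccn, city, and ssn) as a local lookup. It assumes the role columns appear in the order Admin, Finance, Marketing, HR, each with P followed by U. The names POLICY and is_allowed are hypothetical, the superuser is not modeled, and the actual enforcement is done by the AI Developer Edition API Service, not by code like this.

```python
# Operations permitted per role, copied from three rows of the policy tables.
POLICY = {
    "ccn": {"Admin": {"protect"}, "Finance": {"unprotect"}, "Marketing": set(), "HR": {"unprotect"}},
    "city": {"Admin": {"protect"}, "Finance": {"unprotect"}, "Marketing": {"unprotect"}, "HR": {"unprotect"}},
    "ssn": {"Admin": {"protect"}, "Finance": set(), "Marketing": set(), "HR": set()},
}


def is_allowed(role: str, operation: str, data_element: str) -> bool:
    """Return True if the table marks the operation with a check for this role."""
    roles = POLICY.get(data_element, {})
    return operation in roles.get(role, set())
```

With this encoding, is_allowed("Finance", "unprotect", "ccn") is True, while is_allowed("Marketing", "unprotect", "ccn") is False.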
7.5 - Removing AI Developer Edition
Open a command prompt.
Navigate to the cloned repository location.
Run the following command to remove the containers and images.
docker compose down --rmi all
Run the following command to remove the Python module.
pip uninstall protegrity-developer-python
7.6 - Known Issues
Issue: SSL errors in the Data Discovery container
Description: The tldextract library tries to download the following Public Suffix List files:
- https://publicsuffix.org/list/public_suffix_list.dat
- https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat
When these lists cannot be downloaded, the default files included in the package are used, and no issue is observed in the classification.