Complete the prerequisites, optionally register for access to the AI Developer Edition API Service, and then set up, verify, and run the required files for using Protegrity AI Developer Edition.
Setting up AI Developer Edition
- 1: Prerequisites
- 2: Optional - Obtaining access to the AI Developer Edition API Service
- 3: Setting up the packages
- 4: Verifying the files in the Protegrity AI Developer Edition package
1 - Prerequisites
General requirements
The system requirements for AI Developer Edition are provided here.
Hardware requirements
For the local Docker deployment mode, a machine with the following specifications is sufficient to experiment with the main features:
- RAM: 16 GB
- CPU: 8 cores
- GPU: 4 GB VRAM, for Synthetic Data only
- Hard Disk: 50 GB available
Software requirements
- Python 3.12.11 or later is installed. For more information about installing Python, refer to the Python website. Ensure that the python command points to a supported Python 3 version, for example, Python 3.12.11. Verify using the python --version command.
- pip for installing packages.
- Python Virtual Environment.
- Docker CLI is installed to manage Docker containers. On macOS, Docker Desktop or Colima is installed.
- Docker Compose is installed for local containerized deployments. This application supports Docker Compose V2.30 and later. Ensure that your installation meets this version.
- Git is installed for cloning the repository.
- Java 11 or later.
- Maven 3.6+ for AP Java.
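The version requirements above can be checked with a short script. The following is a minimal sketch and not part of the product: the helper names (python_ok, compose_ok, missing_tools) are hypothetical, and the executable names passed to shutil.which are assumptions about a typical installation.

```python
import re
import shutil
import sys

MIN_PYTHON = (3, 12, 11)  # minimum Python version stated in this section
MIN_COMPOSE = (2, 30)     # minimum Docker Compose version stated in this section
# Assumed executable names; adjust them for your platform.
REQUIRED_TOOLS = ["docker", "git", "java", "mvn"]


def python_ok(version=None):
    """Return True if the given (or the running) Python version is supported."""
    current = tuple(version) if version is not None else tuple(sys.version_info[:3])
    return current >= MIN_PYTHON


def compose_ok(version_text):
    """Return True if a version string such as 'v2.30.1' meets the minimum."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= MIN_COMPOSE


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


print("Python supported:", python_ok())
print("Missing tools:", missing_tools())
```

To check Docker Compose, pass the text printed by docker compose version --short to compose_ok.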
Additional settings for macOS
macOS requires additional steps for Docker and for systems with Apple Silicon chips. Complete the following steps before using AI Developer Edition.
Complete one of the following options to apply the settings.
- For Colima:
  - Open a command prompt.
  - Run the following command.
    colima start --vm-type vz --vz-rosetta --memory 8
- For Docker Desktop:
  - Open Docker Desktop.
  - Go to Settings > General.
  - Enable the following check boxes:
    - Use Virtualization framework
    - Use Rosetta for x86_64/amd64 emulation on Apple Silicon
  - Click Apply & restart.
To resolve certificate-related errors, complete one of the following options.
- For Colima:
  - Open a command prompt.
  - Navigate to and open the following file.
    ~/.colima/default/colima.yaml
  - Update the following configuration in colima.yaml to add the path for obtaining the required images.
    Before update:
    docker: {}
    After update:
    docker:
      insecure-registries:
        - ghcr.io
  - Save and close the file.
  - Stop Colima.
    colima stop
  - Close and reopen the command prompt.
  - Start Colima.
    colima start --vm-type vz --vz-rosetta --memory 8
- For Docker Desktop:
  - Open Docker Desktop.
  - Click the gear or settings icon.
  - Click Docker Engine from the sidebar. The editor opens the current Docker daemon configuration daemon.json.
  - Locate or add the insecure-registries key in the root JSON object. Ensure that a comma is added after the last value in the existing configuration.
    After update:
    {
      .
      .
      <existing configuration>,
      "insecure-registries": [
        "ghcr.io",
        "githubusercontent.com"
      ]
    }
  - Click Apply & Restart to save the changes and restart Docker Desktop.
  - Verify: After Docker restarts, run docker info in your terminal and confirm that the required registry is listed under Insecure Registries.
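The daemon.json change described above amounts to adding one key while leaving every existing key untouched. The following is a minimal sketch of that merge on an in-memory configuration; the function name add_insecure_registries and the sample existing configuration are hypothetical and used only for illustration.

```python
import json


def add_insecure_registries(config, registries):
    """Return a copy of a daemon configuration with the registries added."""
    updated = dict(config)  # existing keys are preserved
    merged = list(updated.get("insecure-registries", []))
    for registry in registries:
        if registry not in merged:  # avoid duplicate entries
            merged.append(registry)
    updated["insecure-registries"] = merged
    return updated


# Hypothetical existing configuration, used only for illustration.
existing = {"experimental": False}
result = add_insecure_registries(existing, ["ghcr.io", "githubusercontent.com"])
print(json.dumps(result, indent=2))
```

Because the output is produced by json.dumps, the comma between the existing configuration and the new key is always placed correctly.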
Optional: Complete the following steps if the error "The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested" is displayed.
- Open a command prompt.
- Navigate to and open the following file.
  ~/.docker/config.json
- Add the following parameter.
  "default-platform": "linux/amd64"
- Save and close the file.
- Run docker compose up -d from the protegrity-developer-edition directory if it is already cloned. Otherwise, continue with the setup.
2 - Optional - Obtaining access to the AI Developer Edition API Service
Registration is required only for running the APIs that protect, unprotect, and reprotect data. The find and redact capability that uses Data Discovery, and the Semantic Guardrail and Synthetic Data features, can be used without registration. Skip this section if the find and protect capability, which uses tokenization and encryption, is not required.
Registering for access
Sign up for access to the AI Developer Edition API Service. This is required for obtaining access to use the APIs.
- Open a web browser.
- Navigate to https://www.protegrity.com/developers/dev-edition-api.
- Specify the following details:
- First Name
- Last Name
- Work Email
- Job Title
- Company Name
- Country
- Click the Terms & Conditions link and read the terms and conditions.
- Select the check box to accept the terms and conditions.
- Click Get Started.
The request is analyzed. After the request is approved, a password and an API key for accessing the AI Developer Edition API Service are sent to the Work Email specified. If the account already exists, the details are re-sent to the email address. The email takes a minute or two to arrive. If it does not arrive in the inbox, check the spam or junk folder before retrying.
Use the online Protegrity notebook with the credentials to test tokenization.
Specifying the authentication information
Add the login information provided by Protegrity to the environment to access the AI Developer Edition API Service.
Note: It is recommended to add the details to the environment variables to avoid specifying the information every time the environment is initialized.
- Open a command prompt.
- Initialize a Python virtual environment.
- Add the email address of the user.
  Linux and macOS:
  export DEV_EDITION_EMAIL='<Email_used_for_registration>'
  Windows PowerShell:
  $env:DEV_EDITION_EMAIL = '<Email_used_for_registration>'
- Specify the password provided in the registration email.
  Linux and macOS:
  export DEV_EDITION_PASSWORD='<Password_provided_in_email>'
  Windows PowerShell:
  $env:DEV_EDITION_PASSWORD = '<Password_provided_in_email>'
- Specify the API key for accessing the AI Developer Edition API Service.
  Linux and macOS:
  export DEV_EDITION_API_KEY='<API_key_provided_in_email>'
  Windows PowerShell:
  $env:DEV_EDITION_API_KEY = '<API_key_provided_in_email>'
- Verify that the variables are set.
  Linux and macOS:
  test -n "$DEV_EDITION_EMAIL" && echo "EMAIL $DEV_EDITION_EMAIL set" || echo "EMAIL missing"
  test -n "$DEV_EDITION_PASSWORD" && echo "PASSWORD $DEV_EDITION_PASSWORD set" || echo "PASSWORD missing"
  test -n "$DEV_EDITION_API_KEY" && echo "API KEY $DEV_EDITION_API_KEY set" || echo "API KEY missing"
  Windows PowerShell:
  if ($env:DEV_EDITION_EMAIL) { Write-Output "EMAIL $env:DEV_EDITION_EMAIL set" } else { Write-Output "EMAIL missing" }
  if ($env:DEV_EDITION_PASSWORD) { Write-Output "PASSWORD $env:DEV_EDITION_PASSWORD set" } else { Write-Output "PASSWORD missing" }
  if ($env:DEV_EDITION_API_KEY) { Write-Output "API KEY $env:DEV_EDITION_API_KEY set" } else { Write-Output "API KEY missing" }
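The same check can be made from Python before a script calls the API service. The following is a minimal sketch that reports only the names of the missing variables and never prints their values; the function name missing_credentials is hypothetical.

```python
import os

REQUIRED_VARIABLES = (
    "DEV_EDITION_EMAIL",
    "DEV_EDITION_PASSWORD",
    "DEV_EDITION_API_KEY",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All AI Developer Edition credentials are set.")
```

Reporting names rather than values keeps the password and API key out of terminal history and logs.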
AI Developer Edition API Service usage guidelines
To ensure fair use of the API service, a rate limit is enforced on API requests to the AI Developer Edition API Service.
These limits are:
- Request rate: 50 per second
- Burst: up to 100
- Quota: 10,000 requests per user per day
- Maximum payload size: 1 MB
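A client can stay within these limits by throttling its own requests. The following is a minimal sketch of a client-side token bucket configured with the request rate and burst values listed above. How the service enforces its limits is not stated here, so the token bucket is an assumption chosen for illustration, as are the names TokenBucket and payload_fits and the reading of 1 MB as 1,000,000 bytes.

```python
import time

# Assumption: 1 MB is read conservatively as 1,000,000 bytes.
MAX_PAYLOAD_BYTES = 1_000_000


class TokenBucket:
    """Client-side throttle: refills at `rate` tokens per second up to `burst`."""

    def __init__(self, rate=50.0, burst=100, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Consume one token if available; return True when a request may be sent."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def payload_fits(payload):
    """Return True if the encoded payload is within the maximum size."""
    return len(payload) <= MAX_PAYLOAD_BYTES
```

Call try_acquire before each request and wait briefly when it returns False. The daily quota of 10,000 requests is not tracked by this sketch.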
3 - Setting up the packages
Obtaining the package
Navigate to the Protegrity AI Developer Edition repository.
Clone or download the repositories on your local system.
- protegrity-developer-edition: Contains the files to launch the required containers. It also contains the sample applications and files.
  git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition.git
- To customize the Python modules, clone and use the source from the protegrity-developer-python repository.
  git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-python.git
- To customize the Java libraries, clone and use the source from the protegrity-developer-java repository.
  git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-java.git
Verify the files in the package. The list of files in the git package can be obtained from the files list.
If the repositories were cloned earlier, update them using the following steps.
Back up the Protegrity AI Developer Edition repository if the Python and configuration files in it have been updated.
Note: The supported entities are updated. For more information about the entities, refer to Supported Entities.
Navigate to the cloned repository location for protegrity-developer-edition.
Run the following command to stop the containers.
docker compose down
Depending on your configuration, use the docker-compose down command instead.
Sync to update the repositories on the local system using the git pull command.
- protegrity-developer-edition: Contains the files to launch the required containers. It also contains the sample applications and files.
- protegrity-developer-python: Contains the source files for customizing and using the Python module.
- protegrity-developer-java: Contains the source files for customizing and using the Java library.
Verify the files in the package. The list of files in the git package can be obtained from the files list.
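The choice between cloning a repository and updating an existing clone can be expressed as a small helper. The following is a minimal sketch that only builds the git command and does not run it; the function name git_command and the idea of keeping all three repositories in one workspace directory are assumptions made for illustration.

```python
from pathlib import Path

BASE_URL = "https://github.com/Protegrity-Developer-Edition"
REPOSITORIES = (
    "protegrity-developer-edition",
    "protegrity-developer-python",
    "protegrity-developer-java",
)


def git_command(repository, workspace):
    """Return the git command that clones the repository or updates an existing clone."""
    target = Path(workspace) / repository
    if (target / ".git").is_dir():
        # An existing clone is updated in place.
        return ["git", "-C", str(target), "pull"]
    return ["git", "clone", f"{BASE_URL}/{repository}.git", str(target)]


for name in REPOSITORIES:
    print(" ".join(git_command(name, ".")))
```

Before running a pull command produced this way, stop the containers and back up any changed files as described in the steps above.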
Setting up Data Discovery, Semantic Guardrail, and Synthetic Data
The containers include the Data Discovery and Semantic Guardrails components required for identifying sensitive data. They also include the Synthetic Data component for data generation.
Open a command prompt.
Navigate to the cloned repository location for protegrity-developer-edition.
When updating an existing setup, if the step to stop containers was missed earlier, use the following command to identify and remove the AI Developer Edition containers.
docker compose down --remove-orphans
Then delete the Docker network resources.
docker network rm -f <network_name_or_id>
For example,
docker network rm -f protegrity-network
Run the following command to download and start the containers. The dependent containers are large. Depending on the network connection, they might take time to download and deploy.
To start all the features:
docker compose --profile synthetic up -d
To start only the Data Discovery and Semantic Guardrails features:
docker compose up -d
Depending on your configuration, use the docker-compose up -d command instead. Ensure that you bring down the containers using docker compose --profile synthetic down or docker compose down before switching between starting all containers and starting only the Data Discovery and Semantic Guardrails containers.
Verify that the containers started successfully.
docker compose logs
Set up the Jupyter notebook for working with the notebooks provided. Run the following command from the cloned repository location for protegrity-developer-edition.
pip install -r samples/python/requirements.txt
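In addition to reading the logs, the container states can be checked programmatically. The following is a minimal sketch that parses the output of docker compose ps --all --format json. It assumes that each entry carries the Service and State fields and accepts both a single JSON array and one JSON object per line, because the output shape differs between Docker Compose releases. The function names are hypothetical.

```python
import json
import subprocess


def parse_compose_ps(output):
    """Parse `docker compose ps --format json` output into a list of dicts."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)  # single JSON array
    # Otherwise, one JSON object per line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def not_running(containers):
    """Return the service names whose state is not 'running'."""
    return [c.get("Service", "?") for c in containers if c.get("State") != "running"]


def check_containers():
    """Run `docker compose ps` in the current directory and list stopped services."""
    completed = subprocess.run(
        ["docker", "compose", "ps", "--all", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return not_running(parse_compose_ps(completed.stdout))
```

Call check_containers() from the protegrity-developer-edition directory after starting the containers. An empty list means that every listed service reports the running state.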
Installing the protegrity-developer-python module
The module has built-in functions to find, redact, mask, and protect data.
To install the module:
Open a command prompt.
Install the protegrity-developer-python module. It is recommended to install and activate the Python virtual environment before running this command.
pip install protegrity-developer-python
The installation completes and the success message is displayed. To compile and install the Python module from source, refer to Building the Python module.
To upgrade an installed module:
Open a command prompt.
Upgrade the protegrity-developer-python module. It is recommended to install and activate the Python virtual environment before running the command.
pip install --upgrade protegrity-developer-python
The package is successfully upgraded.
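To confirm which version of the package is present in the active environment, the installed distribution metadata can be queried with the standard library. The following is a minimal sketch; the function name installed_version is hypothetical, and the sketch does not call any function of the module itself.

```python
from importlib import metadata


def installed_version(distribution="protegrity-developer-python"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version:
    print("protegrity-developer-python", version)
else:
    print("protegrity-developer-python is not installed in this environment")
```

Run the script inside the same virtual environment that was used for the pip command, because the lookup is specific to the active environment.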
Installing the protegrity-developer-java library
When you run the Java samples for the first time, Maven automatically pulls the protegrity-developer-java library from Maven Central as a dependency. This ensures that all required classes and resources are available without manual download.
4 - Verifying the files in the Protegrity AI Developer Edition package
protegrity-developer-edition repository
This repository contains the files for obtaining and running the sample application.
- CHANGELOG.md: Tracks version updates and changes.
- CONTRIBUTIONS.md: The guidelines for contributing to the project.
- LICENSE: The license file with the terms and conditions for using the application.
- README.md: The readme file specifying the steps to set up the product.
- docker-compose.yml: This file contains the configuration for deploying the containers.
- data-discovery: The directory with the sample application and scripts for Data Discovery.
- semantic-guardrail: The directory with the sample scripts for Semantic Guardrail.
- samples: The directory with the sample application and scripts for testing the features.
- java: The directory with the sample scripts for testing the features using Java.
- python: The directory with the sample scripts for testing the features using Python.
- sample-app-semantic-guardrails: The directory with the sample notebook for testing the Semantic Guardrails feature.
- sample-app-synthetic-data: The directory with the sample notebook and datastore for testing the Synthetic Data feature.
- sample-data: The directory with the input file for the scripts.
- config.json: The configuration file for working with the scripts.
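The list above can be compared against a local clone. The following is a minimal sketch that reports entries missing from a directory. It assumes that the first eight entries in the list sit at the top level of the clone; the function name missing_entries and the relative path used in the example call are hypothetical.

```python
from pathlib import Path

# Entries listed for the protegrity-developer-edition repository.
# Assumption: these entries sit at the top level of the clone.
EXPECTED_ENTRIES = (
    "CHANGELOG.md",
    "CONTRIBUTIONS.md",
    "LICENSE",
    "README.md",
    "docker-compose.yml",
    "data-discovery",
    "semantic-guardrail",
    "samples",
)


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


# Example call; replace the path with the location of your clone.
print("Missing:", missing_entries("protegrity-developer-edition"))
```

An empty list indicates that every listed entry was found under the given directory.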
protegrity-developer-python repository
This repository contains the source files for customizing and building the Python module.
- LICENSE: The license file with the terms and conditions for using the application.
- README.md: The readme file for working with the Python module.
- pyproject.toml: The build and project configuration file for the Python module.
- pytest.ini: The configuration file for the Pytest framework.
- requirements.txt: The list of Python package dependencies.
- setup.cfg: The additional settings for packaging and tools.
- src: The core implementation of the Python module.
- tests: The unit and integration tests to ensure that the Python module works as expected.
- .gitignore: The files and directories to ignore in version control.
- .pylintrc: Defines the linting rules for code quality.
- CHANGELOG.md: Tracks version updates and changes.
- CONTRIBUTIONS.md: The guidelines for contributing to the project.
protegrity-developer-java repository
This repository contains the source files for customizing and building the Java library.
- LICENSE: The license file with the terms and conditions for using the application.
- README.md: The readme file for working with the Java library.
- pom.xml: The Maven Project Object Model file for building the Java project.
- .gitignore: The files and directories to ignore in version control.
- run-integration-tests.sh: The shell script to execute integration tests easily.
- mvnw and mvnw.cmd: The Maven Wrapper scripts for Linux, Mac, and Windows.
- protegrity-developer-edition: The additional modules or extensions for the Developer Edition.
- integration-tests: The integration tests for validating the Java library functionality.
- application-protector-java: The Java library implementation for the Application Protector service. It includes source code and configuration files.
- .mvn/wrapper: The Maven Wrapper configuration files.