Setting up Synthetic Data

Installation instructions for the Synthetic Data feature.

Use the containers to set up the Synthetic Data feature for data generation.

  1. Open a command prompt.

  2. Navigate to the cloned repository location for protegrity-ai-developer-edition.

  3. Run the following commands to download and start the containers. The dependent containers are large, so depending on your network connection, they might take some time to download and deploy.

    cd synthetic-data
    docker compose up -d
    

    If your configuration uses the standalone docker-compose binary instead of the Compose plugin, run the docker-compose up -d command instead.

    Note: By default, images are obtained from ghcr.io. To obtain images from public.ecr.aws instead, navigate to the synthetic-data directory and copy the .env.example file to .env. Open the .env file, uncomment the REGISTRY=public.ecr.aws/protegrity-ai-developer-edition line, and save the file. Then run the docker compose up -d command to download and start the containers.

  4. Verify that the containers started successfully.

    docker compose logs
    
  5. Set up Jupyter Notebook to work with the provided notebooks. Run the following command from the cloned repository location for protegrity-ai-developer-edition.

    pip install -r shared/requirements.txt
    
  6. Install the Synthetic Data SDK package.

    pip install protegrity-synthetic-data-sdk
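
The registry switch described in the note for step 3 can also be scripted. The following is a minimal sketch, not part of the product: the function name use_ecr_registry is hypothetical, and it assumes that the REGISTRY line in .env.example is commented out with a leading # character. It refuses to overwrite an existing .env file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create .env from .env.example with the
# public.ecr.aws REGISTRY line uncommented.
# Assumption: the line is commented out with a leading "#".
use_ecr_registry() {
  local dir="${1:-synthetic-data}"
  local src="$dir/.env.example"
  local dst="$dir/.env"

  if [ ! -f "$src" ]; then
    echo "Missing $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    echo "$dst already exists; edit it manually." >&2
    return 1
  fi

  # Copy the file, stripping the comment marker from the REGISTRY line only.
  sed 's|^[[:space:]]*#[[:space:]]*\(REGISTRY=public\.ecr\.aws/protegrity-ai-developer-edition\)[[:space:]]*$|\1|' \
    "$src" > "$dst"
}
```

For example, from the cloned repository location, run use_ecr_registry synthetic-data, and then run the docker compose up -d command from the synthetic-data directory.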
    
To upgrade an existing installation of the Synthetic Data feature, complete the following steps.

  1. Open a command prompt.

  2. Navigate to the cloned repository location for protegrity-ai-developer-edition.

  3. If you did not stop the containers earlier, use the following command to stop and remove the AI Developer Edition containers.

    docker compose down --remove-orphans
    
  4. Delete the Docker network resources.

    docker network rm -f <network_name_or_id>
    

    For example:

    docker network rm -f protegrity-network
    
  5. Run the following commands to download and start the containers. The dependent containers are large, so depending on your network connection, they might take some time to download and deploy.

    cd synthetic-data
    docker compose up -d
    

    If your configuration uses the standalone docker-compose binary instead of the Compose plugin, run the docker-compose up -d command instead.

  6. Verify that the containers started successfully.

    docker compose logs
    
  7. Set up Jupyter Notebook to work with the provided notebooks. Run the following command from the cloned repository location for protegrity-ai-developer-edition.

    pip install -r shared/requirements.txt
    
  8. Upgrade the Synthetic Data SDK package.

    pip install --upgrade protegrity-synthetic-data-sdk
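
After the upgrade, it can help to confirm which SDK version is installed. The following is a minimal sketch that reads the Version field from the pip show output. The function name sdk_version is hypothetical and not part of the SDK.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the installed version of a pip package.
# Prints nothing and returns non-zero if the package is not installed.
sdk_version() {
  local pkg="${1:-protegrity-synthetic-data-sdk}"
  local version
  # pip show prints metadata lines such as "Version: <number>".
  version="$(pip show "$pkg" 2>/dev/null | awk -F': ' '/^Version:/ {print $2}')"
  [ -n "$version" ] && echo "$version"
}
```

Running sdk_version with no arguments checks the protegrity-synthetic-data-sdk package. To check the state of the containers as well, run the docker compose ps command from the synthetic-data directory.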
    

Last modified: June 25, 2026