Data Discovery API

Classify API.

Data Discovery Classification Service

This API identifies, classifies, and locates sensitive data.

Endpoint

https://{Host Address}/pty/data-discovery/v1.0/classify

Path

/pty/data-discovery/v1.0/classify

Method

POST

Parameters

Set the score_threshold query parameter to exclude results whose score falls below the given value. This parameter is optional and accepts the following values:

Type: float
Values: minimum 0, maximum 1.0
Default: 0.00

For example, score_threshold = 0.75
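
As a minimal sketch, the threshold can be validated on the client before it is appended to the request URL as a query parameter. The helper name build_classify_url and the host discovery.example.com are hypothetical and used only for illustration; the path and the score_threshold parameter are the ones documented on this page.

```python
from urllib.parse import urlencode

CLASSIFY_PATH = "/pty/data-discovery/v1.0/classify"  # path documented on this page


def build_classify_url(host, score_threshold=None):
    """Return the classify URL, optionally with a validated score_threshold.

    Hypothetical helper for illustration; it is not part of the API.
    """
    url = f"https://{host}{CLASSIFY_PATH}"
    if score_threshold is None:
        return url  # no parameter sent, so the documented default applies
    threshold = float(score_threshold)
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    return f"{url}?{urlencode({'score_threshold': threshold})}"


# The host name below is a placeholder, not a real server.
print(build_classify_url("discovery.example.com", 0.75))
```

Rejecting out-of-range values locally avoids a round trip for a request that falls outside the documented range.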

Example Data

You can reach Dave Elliot by phone 203-555-1286.

The data must be UTF-8 encoded and must not exceed 10,000 characters.
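
A client can check both constraints before sending a request. The sketch below assumes that the limit is counted in characters rather than in encoded bytes, which is how this page words it; the helper name prepare_payload is hypothetical.

```python
MAX_CHARS = 10_000  # character limit stated on this page


def prepare_payload(text):
    """Check the length limit and return the UTF-8 encoded request body.

    Hypothetical helper for illustration; it is not part of the API.
    Assumption: the limit applies to characters, not to encoded bytes.
    """
    if len(text) > MAX_CHARS:
        raise ValueError(f"input has {len(text)} characters; limit is {MAX_CHARS}")
    # str.encode raises UnicodeEncodeError for text that cannot be encoded,
    # for example lone surrogate code points.
    return text.encode("utf-8")


body = prepare_payload("You can reach Dave Elliot by phone 203-555-1286.")
print(type(body).__name__, len(body))
```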

Response Codes

Successful Response.
{
  "providers": [
    {
      "name": "Presidio Classification Provider",
      "version": "1.0.0",
      "status": 200,
      "elapsed_time": 1.014178991317749,
      "exception": null,
      "config_provider": {
        "name": "Presidio",
        "address": "http://presidio_provider_service",
        "supported_content_types": []
      }
    },
    {
      "name": "Roberta Classification Provider",
      "version": "1.0.0",
      "status": 200,
      "elapsed_time": 19.091534852981567,
      "exception": null,
      "config_provider": {
        "name": "Roberta",
        "address": "http://roberta_provider_service",
        "supported_content_types": []
      }
    }
  ],
  "classifications": {
    "PERSON": [
      {
        "score": 0.9236000061035157,
        "location": {
          "start_index": 14,
          "end_index": 25
        },
        "classifiers": [
          {
            "provider_index": 0,
            "name": "SpacyRecognizer",
            "score": 0.85,
            "details": {}
          },
          {
            "provider_index": 1,
            "name": "roberta",
            "score": 0.9972000122070312,
            "details": {}
          }
        ]
      }
    ],
    "PHONE_NUMBER": [
      {
        "score": 0.8746500015258789,
        "location": {
          "start_index": 35,
          "end_index": 47
        },
        "classifiers": [
          {
            "provider_index": 0,
            "name": "PhoneRecognizer",
            "score": 0.75,
            "details": {}
          },
          {
            "provider_index": 1,
            "name": "roberta",
            "score": 0.9993000030517578,
            "details": {}
          }
        ]
      }
    ]
  }
}

Error responses:
- Request must have a body, but no request body was provided.
- Payload too large.
- Unsupported media type.
- Unexpected internal server error. Check server logs.
- Internal server error. Check server logs.
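
The location indices in the successful response can be mapped back onto the submitted text. The sketch below uses a trimmed copy of the sample response and rests on two assumptions drawn from that sample rather than from a stated contract: end_index is exclusive, and provider_index is a position in the providers list. The function name extract_matches is hypothetical.

```python
text = "You can reach Dave Elliot by phone 203-555-1286."

# Trimmed copy of the sample successful response (fields not used here omitted).
response = {
    "providers": [
        {"name": "Presidio Classification Provider"},
        {"name": "Roberta Classification Provider"},
    ],
    "classifications": {
        "PERSON": [
            {
                "score": 0.9236000061035157,
                "location": {"start_index": 14, "end_index": 25},
                "classifiers": [
                    {"provider_index": 0, "name": "SpacyRecognizer"},
                    {"provider_index": 1, "name": "roberta"},
                ],
            }
        ],
        "PHONE_NUMBER": [
            {
                "score": 0.8746500015258789,
                "location": {"start_index": 35, "end_index": 47},
                "classifiers": [
                    {"provider_index": 0, "name": "PhoneRecognizer"},
                    {"provider_index": 1, "name": "roberta"},
                ],
            }
        ],
    },
}


def extract_matches(text, response):
    """Return (label, matched_text, score, provider_names) tuples."""
    providers = response["providers"]
    matches = []
    for label, hits in response["classifications"].items():
        for hit in hits:
            loc = hit["location"]
            # Assumption: end_index is exclusive, as in Python slicing.
            matched = text[loc["start_index"]:loc["end_index"]]
            # Assumption: provider_index points into the providers list.
            names = [providers[c["provider_index"]]["name"] for c in hit["classifiers"]]
            matches.append((label, matched, hit["score"], names))
    return matches


for label, matched, score, names in extract_matches(text, response):
    print(f"{label}: {matched}")
# PERSON: Dave Elliot
# PHONE_NUMBER: 203-555-1286
```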

Sample Request

cURL:

curl -X POST "https://<SERVER_IP>/pty/data-discovery/v1.0/classify?score_threshold=0.85" \
  -H "Content-Type: text/plain" \
  --data "You can reach Dave Elliot by phone 203-555-1286"

Python:

import requests

url = "https://<SERVER_IP>/pty/data-discovery/v1.0/classify"
params = {"score_threshold": 0.85}  # optional query parameter
headers = {"Content-Type": "text/plain"}
# Encode the body explicitly, because the API expects UTF-8 data.
data = "You can reach Dave Elliot by phone 203-555-1286".encode("utf-8")

# verify=False disables TLS certificate verification; use it only for testing.
response = requests.post(url, params=params, headers=headers, data=data, verify=False)

print("Status code:", response.status_code)
print("Response JSON:", response.json())

Request summary:

URL: POST `https://<SERVER_IP>/pty/data-discovery/v1.0/classify`
Query Parameters:
- score_threshold (optional): float between 0.0 and 1.0; default 0.0
Headers:
- Content-Type: text/plain
Body:
- You can reach Dave Elliot by phone 203-555-1286

Last modified: July 30, 2025