Using the Anonymization APIs

This section lists the Python client APIs for Anonymization.

client

Anonymization SDK Client.

Provides synchronous (AnonymizationClient) and asynchronous (AsyncAnonymizationClient) Python clients for the Anonymization API.

Public models, enums, and exceptions are re-exported here for backward compatibility so that from anonymization_sdk.client import X continues to work.

AnonymizationClient

class AnonymizationClient()

Synchronous client for the Anonymization API.

Arguments:

  • base_url - Base URL of the Anonymization API (default: http://localhost:8000)
  • timeout - Request timeout in seconds (default: 30)
  • headers - Additional headers to include in requests

__init__
def __init__(base_url: str = DEFAULT_BASE_URL,
             timeout: float = DEFAULT_TIMEOUT,
             headers: dict[str, str] | None = None,
             mlops_config: dict[str, Any] | None = None)

Initialize the Anonymization client.

Arguments:

  • base_url - Base URL of the Anonymization API
  • timeout - Request timeout in seconds
  • headers - Additional HTTP headers to include in requests
  • mlops_config - Default MLOps tracking configuration applied to every anonymize, auto_anonymize, apply_anon, and calculate_risk call. Can be overridden per-call by passing mlops_config explicitly.

close
def close() -> None

Close the HTTP client.

is_healthy
def is_healthy() -> bool

Check if the API is healthy and responding.

Returns:

True if the API is reachable and healthy, False otherwise.

get_health
def get_health() -> dict[str, Any]

Get detailed health information from the API.

Returns:

Dictionary with health status, version, and component states.

Raises:

  • APIError - If the API returns an error status.
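A minimal sketch of a start-up check built on is_healthy() and get_health(). It accepts any object that exposes those two methods plus close(), so it can be called with an AnonymizationClient or with a test double. The {"status": "unreachable"} fallback is this example's own convention, not something the SDK is documented to return.

```python
from typing import Any


def describe_service(client: Any) -> dict[str, Any]:
    """Return health details, or a fallback dict if the API is not healthy."""
    try:
        # is_healthy() is documented to return False rather than raise.
        if not client.is_healthy():
            return {"status": "unreachable"}  # fallback shape is an assumption
        # get_health() may raise APIError on an error status; let it propagate.
        return client.get_health()
    finally:
        # Always release the underlying HTTP client.
        client.close()
```

Because the helper closes the client, it suits one-off checks; a long-lived client would call the two health methods directly.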

detect_qi
def detect_qi(data: DataInputType,
              *,
              mode: DetectionMode | str = DetectionMode.AUTO,
              sampling_method: SamplingMethod | str = SamplingMethod.FAST,
              cumulative_importance_threshold: float = 0.8,
              max_quasi_identifiers: int = 10,
              uniqueness_threshold: float = 0.95,
              known_identifiers: list[str] | None = None,
              known_sensitive: list[str] | None = None,
              ignore_columns: list[str] | None = None) -> DetectionResult

Detect quasi-identifiers in a dataset.

Arguments:

  • data - Inline records (List[Dict]), local file path / file:// URI, or cloud URI (s3://, gs://, azure://, etc.). Local paths are read and encoded automatically.
  • mode - Detection algorithm (“auto”, “ml”, “heuristic”).
  • sampling_method - Sampling strategy (“fast”, “full”, “adaptive”).
  • cumulative_importance_threshold - Stop adding QIs at this cumulative importance threshold (0.0–1.0, default 0.8).
  • max_quasi_identifiers - Maximum QIs to return (default 10).
  • uniqueness_threshold - Columns above this uniqueness ratio are flagged as direct identifiers (0.0–1.0, default 0.95).
  • known_identifiers - Columns you know are direct identifiers.
  • known_sensitive - Columns you know are sensitive.
  • ignore_columns - Columns to skip during detection.

Returns:

DetectionResult with quasi_identifiers, direct_identifiers, sensitive_attributes, attributes, and optional model_metrics.

Raises:

  • APIError - If the API returns an error.
  • ValidationError - If the request is invalid.
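To make uniqueness_threshold concrete, the sketch below computes a uniqueness ratio locally and flags columns above the threshold. It assumes the ratio means distinct values divided by row count; the service may define or sample it differently, so treat this as an illustration of the idea rather than the detection algorithm itself.

```python
from typing import Any

Record = dict[str, Any]


def uniqueness_ratio(records: list[Record], column: str) -> float:
    """Distinct values divided by row count (assumed definition)."""
    values = [row[column] for row in records]
    return len(set(values)) / len(values)


def flag_direct_identifiers(records: list[Record], threshold: float = 0.95) -> list[str]:
    """Columns whose uniqueness ratio exceeds the threshold."""
    return [col for col in records[0] if uniqueness_ratio(records, col) > threshold]


records = [
    {"patient_id": "p1", "zipcode": "02139", "age": 34},
    {"patient_id": "p2", "zipcode": "02139", "age": 34},
    {"patient_id": "p3", "zipcode": "10001", "age": 51},
    {"patient_id": "p4", "zipcode": "10001", "age": 51},
]
print(flag_direct_identifiers(records))  # ['patient_id']
```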

generate_config
def generate_config(data: DataInputType,
                    *,
                    privacy_model: PrivacyModel
                    | str = PrivacyModel.K_ANONYMITY,
                    k: int = 5,
                    l: int | None = None,
                    t: float | None = None,
                    mode: DetectionMode | str = DetectionMode.AUTO,
                    **kwargs) -> AutoConfigResult

Generate anonymization configuration automatically.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • privacy_model - Privacy model (“k-anonymity”, “l-diversity”, “t-closeness”).
  • k - K value (default 5).
  • l - L value for l-diversity.
  • t - T threshold for t-closeness.
  • mode - Detection algorithm (“auto”, “ml”, “heuristic”).
  • **kwargs - max_suppression, diversity_type, distance_metric, sampling_method.

Returns:

AutoConfigResult with detection results and a ready-to-use anonymize_request configuration dict.

calculate_risk
def calculate_risk(data: DataInputType,
                   quasi_identifiers: list[str] | None = None,
                   *,
                   risk_threshold: float = 0.2,
                   suppress_value: str = "*",
                   include_prosecutor: bool = True,
                   include_journalist: bool = True,
                   include_marketer: bool = True,
                   mlops_config: dict[str, Any] | None = None) -> RiskResult

Calculate re-identification risk metrics.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • quasi_identifiers - QI column names to consider for risk.
  • risk_threshold - Records above this threshold are “at risk” (default 0.2).
  • suppress_value - Value marking suppressed records (default “*”).
  • include_prosecutor - Calculate prosecutor risk (default True).
  • include_journalist - Calculate journalist risk (default True).
  • include_marketer - Calculate marketer risk (default True).
  • mlops_config - MLOps config override.

Returns:

RiskResult with the prosecutor, journalist, and marketer risk models, plus k_anonymity, highest_risk_level, and equivalence class statistics.
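The quantities above can be illustrated locally with the textbook prosecutor model, in which a record's risk is one over the size of its equivalence class (the group of records sharing the same quasi-identifier values). This is a sketch of that standard definition, not the service's implementation, and the dictionary keys it returns are illustrative rather than RiskResult fields.

```python
from collections import Counter
from typing import Any


def prosecutor_risk_summary(
    records: list[dict[str, Any]],
    quasi_identifiers: list[str],
    risk_threshold: float = 0.2,
) -> dict[str, Any]:
    # Records with identical QI values form one equivalence class.
    sizes = Counter(tuple(row[q] for q in quasi_identifiers) for row in records)
    smallest = min(sizes.values())
    # Count the records whose per-record risk (1 / class size) exceeds the threshold.
    at_risk = sum(n for n in sizes.values() if 1 / n > risk_threshold)
    return {
        "k_anonymity": smallest,        # size of the smallest class
        "highest_risk": 1 / smallest,   # risk of the most exposed record
        "records_at_risk": at_risk,
    }


records = [{"age": "30-39", "zipcode": "021**"}] * 6 + [{"age": "40-49", "zipcode": "100**"}]
summary = prosecutor_risk_summary(records, ["age", "zipcode"])
```

Here the single record in its own class is the only one above the 0.2 threshold, and it also fixes k at 1.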

anonymize
def anonymize(data: DataInputType,
              *,
              privacy_model: PrivacyModel | str = PrivacyModel.K_ANONYMITY,
              k: int = 5,
              l: int | None = None,
              t: float | None = None,
              attributes: list[dict[str, Any]] | None = None,
              max_suppression: float = 0.0,
              output_uri: str | None = None,
              output_format: str = "csv",
              mlops_config: dict[str, Any] | None = None,
              **kwargs) -> AnonymizeResult

Anonymize data synchronously using the specified privacy model.

Arguments:

  • data - Inline records (List[Dict]), local file path / file:// URI, or cloud URI (s3://, gs://, azure://, etc.). Local paths are read and encoded automatically.
  • privacy_model - Privacy model (“k-anonymity”, “l-diversity”, “t-closeness”).
  • k - K value for k-anonymity (default 5).
  • l - L value for l-diversity.
  • t - T threshold for t-closeness (0.0–1.0).
  • attributes - Attribute configurations - list of dicts with name, type (“quasi_identifier”, “sensitive”, “identifier”, “insensitive”), and optional hierarchy.
  • max_suppression - Maximum fraction of records to suppress (0.0–1.0).
  • output_uri - Cloud URI to write results to instead of returning inline (e.g. "s3://bucket/output.csv"). When set, result_path is populated in the response instead of data.
  • output_format - Format for cloud output (“csv”, “parquet”, “json”).
  • mlops_config - MLOps tracking configuration.
  • **kwargs - diversity_type, distance_metric, use_lattice_search, etc.

Returns:

AnonymizeResult with data (inline) or result_path (cloud output), plus row_count, suppressed_count, and metrics.
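The attributes argument is a list of dicts with name, type, and an optional hierarchy. The helper below is a hypothetical convenience (it is not part of the SDK) that assembles that list from plain column names. The hierarchy values are passed through untouched because their format is not described in this section.

```python
from typing import Any, Optional, Sequence


def build_attributes(
    quasi_identifiers: Sequence[str],
    sensitive: Sequence[str] = (),
    identifiers: Sequence[str] = (),
    hierarchies: Optional[dict[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Build the list-of-dicts shape described for the attributes argument."""
    hierarchies = hierarchies or {}
    groups = (
        ("identifier", identifiers),
        ("quasi_identifier", quasi_identifiers),
        ("sensitive", sensitive),
    )
    attributes = []
    for attr_type, names in groups:
        for name in names:
            entry: dict[str, Any] = {"name": name, "type": attr_type}
            if name in hierarchies:
                entry["hierarchy"] = hierarchies[name]  # opaque to this helper
            attributes.append(entry)
    return attributes


attrs = build_attributes(["age", "zipcode"], sensitive=["diagnosis"], identifiers=["ssn"])
```

The resulting list would then be passed as client.anonymize(records, k=5, attributes=attrs).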

submit_job
def submit_job(data: DataInputType,
               *,
               privacy_model: PrivacyModel | str = PrivacyModel.K_ANONYMITY,
               k: int = 5,
               l: int | None = None,
               t: float | None = None,
               attributes: list[dict[str, Any]] | None = None,
               max_suppression: float = 0.0,
               **kwargs) -> JobResponse

Submit an anonymization job for asynchronous processing.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • privacy_model - Privacy model (“k-anonymity”, “l-diversity”, “t-closeness”).
  • k - K value for k-anonymity (default 5).
  • l - L value for l-diversity.
  • t - T threshold for t-closeness.
  • attributes - Attribute configurations.
  • max_suppression - Maximum suppression rate (0.0–1.0).
  • **kwargs - Additional parameters (diversity_type, distance_metric).

Returns:

JobResponse with job_id, status, message, and created_at timestamp.

get_job_status
def get_job_status(job_id: str) -> JobStatusResponse

Get the status of an anonymization job.

Poll this method to track progress of jobs submitted via submit_job(). The response includes progress percentage, status, timestamps, and any error messages if the job failed.

Arguments:

  • job_id - Unique job identifier returned by submit_job()

Returns:

JobStatusResponse with:

  • job_id: Job identifier
  • status: Current status (pending, running, completed, failed, cancelled)
  • progress: Progress percentage (0-100)
  • message: Status message
  • created_at: Job creation timestamp
  • updated_at: Last update timestamp
  • completed_at: Completion timestamp (if completed)
  • result_path: Path to result file (if completed)
  • error: Error message (if failed)

Raises:

  • APIError - If job not found or API call fails

cancel_job
def cancel_job(job_id: str) -> None

Cancel a pending or running anonymization job.

Cancels a job that was submitted via submit_job(). Only jobs with status PENDING or RUNNING can be cancelled. Completed, failed, or already cancelled jobs cannot be cancelled.

Arguments:

  • job_id - Unique job identifier returned by submit_job()

Raises:

  • APIError - If job not found or cannot be cancelled

apply_anon
def apply_anon(job_id: str,
               data: DataInputType,
               *,
               mlops_config: dict[str, Any] | None = None) -> "ApplyResult"

Apply a saved anonymization solution to new data.

Re-uses the generalization levels computed during a prior anonymize() call identified by job_id. The lattice is not recomputed.

Arguments:

  • job_id - Solution identifier returned in AnonymizeResult.job_id.
  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • mlops_config - Optional per-request MLOps tracking configuration.

Returns:

ApplyResult with anonymized data, row/suppressed counts, source_job_id, and privacy_model.
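A sketch of the solve-once, apply-many pattern described above: one anonymize() call produces a solution, and its job_id is reused for later batches. The helper name is hypothetical, and it assumes the result object exposes job_id as an attribute.

```python
from typing import Any


def anonymize_then_reuse(
    client: Any, reference_data: Any, new_batches: list[Any], k: int = 5
) -> list[Any]:
    """Anonymize reference data once, then apply the saved solution to each batch."""
    # The lattice search happens here, once.
    solved = client.anonymize(reference_data, k=k)
    # Later batches reuse the saved generalization levels via the solution's job_id.
    return [client.apply_anon(solved.job_id, batch) for batch in new_batches]
```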

list_models
def list_models(*,
                model_type: str | None = None,
                all_metrics: bool = False) -> dict[str, Any]

List tracked anonymization models in Production.

Arguments:

  • model_type - Optional filter by privacy model type (e.g. “k-anonymity”).
  • all_metrics - If True, return all metrics instead of only the promotion metric.

Returns:

Raw response dict with ‘models’ list and ‘count’.

list_jobs
def list_jobs(*,
              status: JobStatus | str | None = None,
              limit: int = 100,
              offset: int = 0) -> "JobListResult"

List all jobs, with an optional status filter and pagination.

Returns newest jobs first.

Arguments:

  • status - Optional filter (e.g. JobStatus.COMPLETED or “failed”)
  • limit - Page size (1-1000, default 100)
  • offset - Page offset (default 0)

Returns:

JobListResult with jobs list, total count, limit, and offset.

Raises:

  • APIError - If the API call fails.
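A sketch of walking every page with limit and offset. It assumes that offset counts jobs rather than pages, and that JobListResult exposes the jobs list and total count as the attributes jobs and total; both are assumptions to verify against the SDK.

```python
from typing import Any, Iterator, Optional


def iter_jobs(client: Any, status: Optional[str] = None, page_size: int = 100) -> Iterator[Any]:
    """Yield every job, fetching one page at a time."""
    offset = 0
    while True:
        page = client.list_jobs(status=status, limit=page_size, offset=offset)
        yield from page.jobs
        offset += len(page.jobs)
        # Stop on an empty page, or once everything counted by total was seen.
        if not page.jobs or offset >= page.total:
            return
```

Using a generator keeps memory flat when the job history is long, since only one page is held at a time.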

get_job_history
def get_job_history(job_id: str) -> list["JobHistoryEntry"]

Get the full state-transition audit trail for a job.

Each create/update call on the server appends an entry with the status, step, progress, and timestamp at that point.

Arguments:

  • job_id - Unique job identifier.

Returns:

List of JobHistoryEntry ordered by sequence.

Raises:

  • APIError - If job not found or API call fails.

wait_for_job
def wait_for_job(job_id: str,
                 *,
                 poll_interval: float = 2.0,
                 timeout: float = 600.0,
                 callback: Any | None = None) -> JobStatusResponse

Poll a job until it reaches a terminal state and return its status.

Arguments:

  • job_id - Unique job identifier returned by submit_job().
  • poll_interval - Seconds between status polls (default 2s).
  • timeout - Maximum seconds to wait (default 600s / 10 min).
  • callback - Optional callable (JobStatusResponse) -> None invoked after each poll.

Returns:

JobStatusResponse at the terminal state. The anonymization result (if completed) is available in status.context["result"].

Raises:

  • APIError - If the job ends in a failed state.
  • TimeoutError - If the job does not complete within timeout.
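The polling behaviour can be sketched as a plain loop. This is an illustration of the pattern, not the SDK's implementation: it takes the status getter as a parameter (for example client.get_job_status), treats completed, failed, and cancelled as terminal, and returns the final status in every terminal case, leaving failure handling to the caller. The sleep and clock parameters are additions that make the loop testable.

```python
import time
from typing import Any, Callable, Optional

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_terminal(
    get_status: Callable[[str], Any],
    job_id: str,
    *,
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    callback: Optional[Callable[[Any], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll get_status(job_id) until a terminal state or until timeout expires."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if callback is not None:
            callback(status)  # invoked after every poll
        # Accept either a plain string or an enum-like object with .value.
        state = getattr(status.status, "value", status.status)
        if str(state).lower() in TERMINAL_STATES:
            return status
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(poll_interval)
```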

auto_anonymize
def auto_anonymize(data: DataInputType,
                   *,
                   privacy_model: PrivacyModel
                   | str = PrivacyModel.K_ANONYMITY,
                   k: int = 5,
                   l: int | None = None,
                   t: float | None = None,
                   mode: DetectionMode | str = DetectionMode.AUTO,
                   mlops_config: dict[str, Any] | None = None,
                   **kwargs) -> AutoAnonymizeResult

Automatically detect QIs and anonymize in one step.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • privacy_model - Privacy model (“k-anonymity”, “l-diversity”, “t-closeness”).
  • k - K value (default 5).
  • l - L value for l-diversity.
  • t - T threshold for t-closeness.
  • mode - Detection algorithm (“auto”, “ml”, “heuristic”).
  • mlops_config - MLOps tracking configuration.
  • **kwargs - max_suppression, sampling_method, use_lattice_search, etc.

Returns:

AutoAnonymizeResult with detection results and anonymized data.

validate
def validate(
        data: DataInputType,
        quasi_identifiers: list[str] | None = None,
        *,
        privacy_model: PrivacyModel | str = PrivacyModel.K_ANONYMITY,
        k: int = 5,
        l: int | None = None,
        t: float | None = None,
        sensitive_attributes: list[str] | None = None) -> ValidationResult

Validate that data meets privacy requirements.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • quasi_identifiers - QI column names to check.
  • privacy_model - Privacy model to validate against.
  • k - Required k for k-anonymity (default 5).
  • l - Required l for l-diversity.
  • t - Required t for t-closeness.
  • sensitive_attributes - Sensitive columns (required for l-diversity/t-closeness).

Returns:

ValidationResult with is_valid, model_type, violations, and statistics.

measure
def measure(original_data: DataInputType,
            anonymized_data: DataInputType,
            quasi_identifiers: list[str] | None = None) -> MetricsResult

Measure anonymization quality metrics.

Arguments:

  • original_data - Original dataset - inline records, local path, or cloud URI.
  • anonymized_data - Anonymized dataset - inline records, local path, or cloud URI.
  • quasi_identifiers - QI column names that were generalized.

Returns:

MetricsResult with information_loss and detailed metrics.
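A sketch that chains anonymize(), validate(), and measure() into one check. The helper is hypothetical; it assumes the result objects expose data, suppressed_count, is_valid, and information_loss as attributes, and that the inline data returned by anonymize() can be passed back in as input.

```python
from typing import Any


def anonymize_and_verify(
    client: Any,
    records: list[dict[str, Any]],
    quasi_identifiers: list[str],
    k: int = 5,
) -> dict[str, Any]:
    """Anonymize records, then confirm the privacy level and report utility loss."""
    attributes = [{"name": q, "type": "quasi_identifier"} for q in quasi_identifiers]
    result = client.anonymize(records, k=k, attributes=attributes)
    # Re-check the output independently of the anonymization step.
    check = client.validate(result.data, quasi_identifiers, k=k)
    # Compare original and anonymized data to quantify information loss.
    quality = client.measure(records, result.data, quasi_identifiers)
    return {
        "is_valid": check.is_valid,
        "information_loss": quality.information_loss,
        "suppressed_count": result.suppressed_count,
    }
```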

create_pattern
def create_pattern(name: str,
                   classification: str,
                   column_patterns: list[str],
                   *,
                   priority: int = 50,
                   value_patterns: list[str] | None = None,
                   min_match_ratio: float = 0.8,
                   description: str | None = None) -> Pattern

Create a custom detection pattern.

Patterns are used during QI detection to automatically classify columns based on their names and values. Custom patterns take precedence over built-in patterns.

Arguments:

  • name - Unique name for the pattern (e.g., ‘customer_id’)
  • classification - Classification type - one of:
    • “DI”: Direct Identifier (e.g., SSN, email)
    • “QI”: Quasi-Identifier (e.g., age, zipcode)
    • “SI”: Sensitive Identifier (e.g., salary, diagnosis)
    • “NSI”: Non-Sensitive Identifier (safe to publish)
  • column_patterns - List of column name patterns to match. Case-insensitive. Use ‘*’ as wildcard (e.g., [‘*_id’, ‘user*’])
  • priority - Priority level (1-1000, lower = checked first). Default: 50
  • value_patterns - Optional list of regex patterns for value validation
  • min_match_ratio - Minimum ratio of values that must match (0-1). Default: 0.8
  • description - Optional description of what this pattern detects

Returns:

Pattern object with assigned ID and metadata

Raises:

  • APIError - If creation fails (e.g., duplicate name)
  • ValidationError - If parameters are invalid
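To show how column_patterns, value_patterns, and min_match_ratio interact, the sketch below reimplements the matching rules locally. It assumes '*' is the wildcard character, uses fnmatch-style globbing (which also gives '?' and '[...]' special meaning), and requires each value to match a regex in full; the service's actual matcher may differ on all three points.

```python
import re
from fnmatch import fnmatchcase


def column_matches(column_name: str, column_patterns: list[str]) -> bool:
    """Case-insensitive wildcard match of a column name against any pattern."""
    name = column_name.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in column_patterns)


def values_match(values: list[str], value_patterns: list[str], min_match_ratio: float = 0.8) -> bool:
    """True when enough values fully match at least one regex."""
    if not values:
        return False
    compiled = [re.compile(p) for p in value_patterns]
    hits = sum(1 for v in values if any(rx.fullmatch(v) for rx in compiled))
    return hits / len(values) >= min_match_ratio


print(column_matches("Customer_ID", ["*_id", "user*"]))  # True
print(column_matches("age", ["*_id", "user*"]))          # False
```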

list_patterns
def list_patterns(classification: str | None = None) -> PatternListResult

List all custom detection patterns.

Arguments:

  • classification - Optional filter by classification (DI, QI, SI, NSI)

Returns:

PatternListResult containing list of patterns and total count

get_pattern
def get_pattern(pattern_id: str) -> Pattern

Get a specific pattern by ID.

Arguments:

  • pattern_id - The pattern ID to retrieve

Returns:

Pattern object

Raises:

  • APIError - If pattern not found (404)

update_pattern
def update_pattern(pattern_id: str,
                   *,
                   name: str | None = None,
                   classification: str | None = None,
                   column_patterns: list[str] | None = None,
                   priority: int | None = None,
                   value_patterns: list[str] | None = None,
                   min_match_ratio: float | None = None,
                   description: str | None = None) -> Pattern

Update an existing pattern.

Only provided fields will be updated; others remain unchanged.

Arguments:

  • pattern_id - The pattern ID to update
  • name - New name for the pattern
  • classification - New classification (DI, QI, SI, NSI)
  • column_patterns - New column name patterns
  • priority - New priority (1-1000)
  • value_patterns - New value regex patterns
  • min_match_ratio - New minimum match ratio (0-1)
  • description - New description

Returns:

Updated Pattern object

Raises:

  • APIError - If pattern not found or update fails
  • ValidationError - If parameters are invalid

delete_pattern
def delete_pattern(pattern_id: str) -> dict[str, Any]

Delete a pattern by ID.

Arguments:

  • pattern_id - The pattern ID to delete

Returns:

Dictionary with confirmation message

Raises:

  • APIError - If pattern not found (404)

delete_all_patterns
def delete_all_patterns() -> dict[str, Any]

Delete all custom patterns.

WARNING: This removes all customer-defined patterns. Built-in patterns from the YAML config are not affected.

Returns:

Dictionary with count of deleted patterns

reload_patterns
def reload_patterns() -> dict[str, Any]

Reload patterns from the storage file.

Use this to sync after manual file edits.

Returns:

Dictionary with count of reloaded patterns

dp_compute
def dp_compute(data: DataInputType,
               *,
               mechanism: DPMechanismType | str = DPMechanismType.MEAN,
               column: str | None = None,
               columns: list[str] | None = None,
               group_by: str | None = None,
               epsilon: float = 1.0,
               delta: float = 0.0,
               noise_type: DPNoiseType | str = DPNoiseType.LAPLACE,
               bounds: tuple | None = None,
               bins: int | None = None,
               histogram_range: tuple | None = None,
               session_id: str | None = None,
               predicate: str | None = None,
               candidates: list | None = None,
               utility_scores: list[float] | None = None,
               sensitivity: float | None = None,
               epsilon_map: dict[str, float] | None = None,
               min_group_size: int | None = None) -> DPComputeResult

Compute a differentially private statistic on a data column.

Arguments:

  • data - Inline records (List[Dict]), local file path, or cloud URI.
  • mechanism - DP mechanism (“mean”, “sum”, “variance”, “histogram”, “count”, “exponential”).
  • column - Column name for single-column queries.
  • columns - Column names for multi-column queries.
  • group_by - Categorical column to group by.
  • epsilon - Privacy parameter epsilon (>0).
  • delta - Privacy parameter delta (>=0, <1).
  • noise_type - “laplace” or “gaussian”.
  • bounds - (lower, upper) clipping bounds. Required for mean/sum/variance.
  • bins - Number of histogram bins (histogram only).
  • histogram_range - (min, max) range for histogram bins.
  • session_id - Budget session ID for cumulative tracking.
  • predicate - Filter expression (e.g., “> 50”, “<= 100”).
  • candidates - Candidate outputs (exponential mechanism only).
  • utility_scores - Utility scores for candidates (exponential only).
  • sensitivity - Utility function sensitivity (exponential only).
  • epsilon_map - Per-column or per-group epsilon overrides.
  • min_group_size - Minimum rows per group (default 5).

Returns:

DPComputeResult with private_value (single) or results dict (multi/group).
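The role of bounds and epsilon can be illustrated with a local Laplace-mechanism mean. This is a textbook sketch, not the service's implementation: it treats the row count as public, so replacing one record moves the clipped mean by at most (upper - lower) / n, and it draws noise from Python's random module, which is not suitable for production privacy guarantees.

```python
import random
from typing import Optional, Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_mean(
    values: Sequence[float],
    bounds: tuple[float, float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Clipped mean plus Laplace noise calibrated to sensitivity / epsilon."""
    rng = rng or random.Random()
    lower, upper = bounds
    # Clipping is what makes the sensitivity finite, hence the required bounds.
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)  # assumes n is public
    return sum(clipped) / len(clipped) + laplace_noise(sensitivity / epsilon, rng)


noisy = dp_mean([10, 20, 30, 1000], bounds=(0, 100), epsilon=1.0, rng=random.Random(0))
```

Smaller epsilon values widen the noise, and wider bounds do the same, which is why tight, domain-informed bounds matter for utility.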

dp_stream_update
def dp_stream_update(session_id: str | None = None,
                     data: DataInputType | None = None,
                     *,
                     column: str | None = None,
                     columns: list[str] | None = None,
                     group_by: str | None = None,
                     mechanism: DPStreamMechanismType | str | None = None,
                     epsilon: float | None = None,
                     delta: float | None = None,
                     noise_type: DPNoiseType | str | None = None,
                     bounds: tuple | None = None,
                     get_result: bool = False,
                     window_size: int | None = None,
                     epsilon_map: dict[str, float] | None = None,
                     min_group_size: int | None = None,
                     budget_session_id: str | None = None) -> DPStreamResult

Feed data into a streaming DP session.

On the first call for a session_id, provide mechanism, epsilon, and bounds. Subsequent calls only need session_id, data, and column.

Arguments:

  • session_id - Unique session identifier.
  • data - Batch of records. Mutually exclusive with data_path.
  • data_path - Cloud/local URI for data batch.
  • column - Column name for single-column streaming.
  • columns - Column names for multi-column streaming.
  • group_by - Categorical column to group by.
  • mechanism - Streaming mechanism. Required on first call.
  • epsilon - Privacy epsilon. Required on first call.
  • delta - Privacy delta.
  • noise_type - Noise mechanism.
  • bounds - Clipping bounds. Required on first call (except for count).
  • get_result - If True, also return the current private result.
  • window_size - Window size for sliding/tumbling window mechanisms.
  • epsilon_map - Per-column or per-group epsilon overrides.
  • min_group_size - Minimum rows per group (default 5).
  • budget_session_id - Link to a budget session for automatic deduction.

Returns:

DPStreamResult with session status and optional results.
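A sketch of the first-call versus later-call convention described above. The helper is hypothetical and assumes the session does not exist yet: it sends mechanism, epsilon, and bounds only with the first batch, and asks for the current private result only with the last one. The mechanism value is supplied by the caller because the valid DPStreamMechanismType members are not listed in this section.

```python
from typing import Any, Iterable


def stream_batches(
    client: Any,
    session_id: str,
    batches: Iterable[Any],
    *,
    column: str,
    mechanism: Any,
    epsilon: float,
    bounds: tuple,
) -> Any:
    """Feed batches into one streaming session and return the last response."""
    batch_list = list(batches)
    result = None
    for index, batch in enumerate(batch_list):
        kwargs: dict[str, Any] = {"column": column}
        if index == 0:
            # Session configuration is only needed on the first call.
            kwargs.update(mechanism=mechanism, epsilon=epsilon, bounds=bounds)
        if index == len(batch_list) - 1:
            # Ask for the private result once, after the final batch.
            kwargs["get_result"] = True
        result = client.dp_stream_update(session_id, batch, **kwargs)
    return result
```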

dp_stream_delete
def dp_stream_delete(session_id: str) -> None

Delete a streaming DP session.

Arguments:

  • session_id - Session to delete.

dp_stream_list_sessions
def dp_stream_list_sessions() -> list

List all active streaming DP sessions.

Returns:

List of dicts with session_id, mechanism, column, batches_processed, total_count.

dp_budget_create
def dp_budget_create(session_id: str,
                     epsilon_budget: float,
                     delta_budget: float = 0.0,
                     composition: str = "basic") -> DPBudgetStatus

Create a privacy budget session.

Arguments:

  • session_id - Unique session identifier.
  • epsilon_budget - Total epsilon budget.
  • delta_budget - Total delta budget.
  • composition - Composition mode (“basic” or “rdp”). RDP requires delta_budget > 0 and yields tighter privacy accounting.

Returns:

DPBudgetStatus with initial budget state.

dp_budget_status
def dp_budget_status(session_id: str) -> DPBudgetStatus

Get privacy budget status for a session.

Arguments:

  • session_id - Session to query.

Returns:

DPBudgetStatus with current spend and remaining budget.
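What a budget session tracks can be pictured with a small local accountant. This sketch covers only basic composition, where the epsilons of successive queries add up; the class and its attribute names are illustrative and do not mirror DPBudgetStatus fields.

```python
class BasicBudget:
    """Local epsilon accountant under basic (additive) composition."""

    def __init__(self, epsilon_budget: float) -> None:
        self.epsilon_budget = epsilon_budget
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return self.epsilon_budget - self.spent

    def charge(self, epsilon: float) -> None:
        """Deduct one query's epsilon, refusing to overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so float rounding does not reject an exact fit.
        if self.spent + epsilon > self.epsilon_budget + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


budget = BasicBudget(1.0)
for _ in range(4):
    budget.charge(0.25)  # four queries exhaust a budget of 1.0
```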

dp_budget_delete
def dp_budget_delete(session_id: str) -> None

Delete a privacy budget session.

Arguments:

  • session_id - Session to delete.

dp_advise_composition
def dp_advise_composition(epsilon_budget: float,
                          num_queries: int,
                          delta_budget: float = 0.0,
                          delta_per_query: float = 0.0) -> dict

Get composition advice for planned queries.

Returns optimal per-query epsilon under basic and RDP composition with a recommendation.

Arguments:

  • epsilon_budget - Total epsilon budget available.
  • num_queries - Number of planned queries.
  • delta_budget - Total delta budget (required for RDP comparison).
  • delta_per_query - Delta per query for Gaussian noise. 0 = Laplace.

Returns:

Dict with basic/rdp analysis, recommendation, and savings_pct.
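For the basic-composition side of this advice, the arithmetic is simple enough to sketch locally: an even split gives each query the total budget divided by the number of queries, and a Laplace query of a given sensitivity then needs noise of scale sensitivity / epsilon. The function names are illustrative, and the RDP analysis is deliberately left to the service.

```python
def per_query_epsilon_basic(epsilon_budget: float, num_queries: int) -> float:
    """Even split of the budget under basic composition."""
    if num_queries < 1:
        raise ValueError("num_queries must be at least 1")
    return epsilon_budget / num_queries


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale of the Laplace mechanism for one query."""
    return sensitivity / epsilon


eps_each = per_query_epsilon_basic(1.0, 4)
print(eps_each, laplace_scale(1.0, eps_each))  # 0.25 4.0
```

Planning four queries instead of one therefore quadruples the noise scale of each, which is the cost that tighter accounting aims to reduce.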

audit_list
def audit_list(*,
               operation: str | None = None,
               status: str | None = None,
               limit: int = 50,
               offset: int = 0) -> list[AuditEntry]

List audit log entries.

Arguments:

  • operation - Filter by operation (dp_compute, anonymize_sync, …).
  • status - Filter by outcome (‘success’ or ‘error’).
  • limit - Max entries to return (1–500).
  • offset - Pagination offset.

Returns:

List of AuditEntry objects.

audit_get
def audit_get(entry_id: str) -> AuditEntry

Get a single audit entry.

Arguments:

  • entry_id - Audit entry ID.

Returns:

AuditEntry with full details.

Raises:

  • APIError - If entry not found (404).

AsyncAnonymizationClient

class AsyncAnonymizationClient()

Asynchronous client for the Anonymization API.

Same interface as AnonymizationClient but with async/await support.
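A sketch of what the async interface enables: several detect_qi() calls issued concurrently with asyncio.gather. The helper is hypothetical and takes any object with awaitable is_healthy(), detect_qi(), and close() methods, such as an AsyncAnonymizationClient.

```python
import asyncio
from typing import Any


async def detect_on_many(client: Any, datasets: list[Any]) -> list[Any]:
    """Run QI detection on several datasets concurrently, then close the client."""
    try:
        if not await client.is_healthy():
            raise ConnectionError("Anonymization API is not healthy")
        # gather() runs the calls concurrently and keeps results in input order.
        return await asyncio.gather(*(client.detect_qi(d) for d in datasets))
    finally:
        await client.close()
```

From synchronous code the coroutine would be driven with asyncio.run(detect_on_many(client, datasets)).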

__init__
def __init__(base_url: str = DEFAULT_BASE_URL,
             timeout: float = DEFAULT_TIMEOUT,
             headers: dict[str, str] | None = None,
             mlops_config: dict[str, Any] | None = None)

Initialize the async Anonymization client.

Arguments:

  • base_url - Base URL of the Anonymization API
  • timeout - Request timeout in seconds
  • headers - Additional HTTP headers to include in requests
  • mlops_config - Default MLOps tracking configuration applied to every anonymize, auto_anonymize, apply_anon, and calculate_risk call. Can be overridden per-call.

close
async def close() -> None

Close the HTTP client.

is_healthy
async def is_healthy() -> bool

Check if the API is healthy and responding.

get_health
async def get_health() -> dict[str, Any]

Get detailed health information.

detect_qi
async def detect_qi(
        data: DataInputType,
        *,
        mode: DetectionMode | str = DetectionMode.AUTO,
        sampling_method: SamplingMethod | str = SamplingMethod.FAST,
        cumulative_importance_threshold: float = 0.8,
        max_quasi_identifiers: int = 10,
        uniqueness_threshold: float = 0.95,
        known_identifiers: list[str] | None = None,
        known_sensitive: list[str] | None = None,
        ignore_columns: list[str] | None = None) -> DetectionResult

Detect quasi-identifiers (async version).

Refer to synchronous detect_qi() for full documentation.

generate_config
async def generate_config(data: DataInputType,
                          *,
                          privacy_model: PrivacyModel
                          | str = PrivacyModel.K_ANONYMITY,
                          k: int = 5,
                          l: int | None = None,
                          t: float | None = None,
                          mode: DetectionMode | str = DetectionMode.AUTO,
                          **kwargs) -> AutoConfigResult

Generate anonymization configuration automatically (async version).

calculate_risk
async def calculate_risk(
        data: DataInputType,
        quasi_identifiers: list[str] | None = None,
        *,
        risk_threshold: float = 0.2,
        suppress_value: str = "*",
        include_prosecutor: bool = True,
        include_journalist: bool = True,
        include_marketer: bool = True,
        mlops_config: dict[str, Any] | None = None) -> RiskResult

Calculate re-identification risk metrics (async version).

anonymize
async def anonymize(data: DataInputType,
                    *,
                    privacy_model: PrivacyModel
                    | str = PrivacyModel.K_ANONYMITY,
                    k: int = 5,
                    l: int | None = None,
                    t: float | None = None,
                    attributes: list[dict[str, Any]] | None = None,
                    max_suppression: float = 0.0,
                    output_uri: str | None = None,
                    output_format: str = "csv",
                    mlops_config: dict[str, Any] | None = None,
                    **kwargs) -> AnonymizeResult

Anonymize data (async version). Refer to synchronous anonymize() for full documentation.

submit_job
async def submit_job(data: DataInputType,
                     *,
                     privacy_model: PrivacyModel
                     | str = PrivacyModel.K_ANONYMITY,
                     k: int = 5,
                     l: int | None = None,
                     t: float | None = None,
                     attributes: list[dict[str, Any]] | None = None,
                     max_suppression: float = 0.0,
                     **kwargs) -> JobResponse

Submit anonymization job (async version).

Refer to synchronous submit_job() for full documentation.

get_job_status
async def get_job_status(job_id: str) -> JobStatusResponse

Get job status (async version).

Refer to synchronous get_job_status() for full documentation.

cancel_job
async def cancel_job(job_id: str) -> None

Cancel job (async version). Refer to synchronous cancel_job() for full documentation.

apply_anon
async def apply_anon(
        job_id: str,
        data: DataInputType,
        *,
        mlops_config: dict[str, Any] | None = None) -> "ApplyResult"

Apply saved anonymization (async). Refer to synchronous apply_anon() for full docs.

list_models
async def list_models(*,
                      model_type: str | None = None,
                      all_metrics: bool = False) -> dict[str, Any]

List tracked anonymization models (async).

Refer to synchronous list_models() for full docs.

list_jobs
async def list_jobs(*,
                    status: JobStatus | str | None = None,
                    limit: int = 100,
                    offset: int = 0) -> "JobListResult"

List jobs (async version). Refer to synchronous list_jobs() for full documentation.

get_job_history
async def get_job_history(job_id: str) -> list["JobHistoryEntry"]

Get job history (async version).

Refer to synchronous get_job_history() for full documentation.

wait_for_job
async def wait_for_job(job_id: str,
                       *,
                       poll_interval: float = 2.0,
                       timeout: float = 600.0,
                       callback: Any | None = None) -> JobStatusResponse

Async version of wait_for_job().

Refer to synchronous wait_for_job() for full documentation.

auto_anonymize
async def auto_anonymize(data: DataInputType,
                         *,
                         privacy_model: PrivacyModel
                         | str = PrivacyModel.K_ANONYMITY,
                         k: int = 5,
                         l: int | None = None,
                         t: float | None = None,
                         mode: DetectionMode | str = DetectionMode.AUTO,
                         mlops_config: dict[str, Any] | None = None,
                         **kwargs) -> AutoAnonymizeResult

Auto-detect and anonymize (async version).

Refer to synchronous auto_anonymize() for full docs.

validate
async def validate(
        data: DataInputType,
        quasi_identifiers: list[str] | None = None,
        *,
        privacy_model: PrivacyModel | str = PrivacyModel.K_ANONYMITY,
        k: int = 5,
        l: int | None = None,
        t: float | None = None,
        sensitive_attributes: list[str] | None = None) -> ValidationResult

Validate privacy requirements (async version).

measure
async def measure(original_data: DataInputType,
                  anonymized_data: DataInputType,
                  quasi_identifiers: list[str] | None = None) -> MetricsResult

Measure anonymization quality metrics (async version).

create_pattern
async def create_pattern(name: str,
                         classification: str,
                         column_patterns: list[str],
                         *,
                         priority: int = 50,
                         value_patterns: list[str] | None = None,
                         min_match_ratio: float = 0.8,
                         description: str | None = None) -> Pattern

Create a custom detection pattern (async version).
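
The parameter names suggest how a pattern might be evaluated, although the matching rules are not documented in this section. The sketch below assumes that column_patterns are regular expressions tested against the column name, that value_patterns are regular expressions tested against each value, and that min_match_ratio is the fraction of values that must match. All three are assumptions, and pattern_matches is a hypothetical helper.

```python
import re
from typing import Optional, Sequence


def pattern_matches(
    column_name: str,
    values: Sequence[object],
    column_patterns: Sequence[str],
    value_patterns: Optional[Sequence[str]] = None,
    min_match_ratio: float = 0.8,
) -> bool:
    """Apply the assumed matching rules to one column and its sample values."""
    # Assumed rule 1: at least one column pattern must match the column name.
    if not any(re.search(p, column_name, re.IGNORECASE) for p in column_patterns):
        return False
    # Without value patterns (or values) the name match alone decides.
    if not value_patterns or not values:
        return True
    # Assumed rule 2: enough values must fully match one of the value patterns.
    hits = sum(
        1 for v in values if any(re.fullmatch(p, str(v)) for p in value_patterns)
    )
    return hits / len(values) >= min_match_ratio
```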

list_patterns
async def list_patterns(
        classification: str | None = None) -> PatternListResult

List all custom detection patterns (async version).

get_pattern
async def get_pattern(pattern_id: str) -> Pattern

Get a specific pattern by ID (async version).
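
One practical reason to use the async client is issuing several independent requests concurrently. A minimal sketch with asyncio.gather follows; fake_get_pattern is a hypothetical stand-in for a coroutine such as get_pattern, and it returns plain dicts rather than Pattern objects.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence


async def fetch_many(
    get_one: Callable[[str], Awaitable[Any]],
    ids: Sequence[str],
) -> list[Any]:
    """Run one request per id concurrently; results keep the order of ids."""
    return await asyncio.gather(*(get_one(item_id) for item_id in ids))


async def fake_get_pattern(pattern_id: str) -> dict[str, str]:
    """Hypothetical request that simulates network latency."""
    await asyncio.sleep(0.01)
    return {"id": pattern_id}
```

By default asyncio.gather propagates the first exception raised by any of the requests; pass return_exceptions=True to collect failures alongside successful results instead.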

update_pattern
async def update_pattern(pattern_id: str,
                         *,
                         name: str | None = None,
                         classification: str | None = None,
                         column_patterns: list[str] | None = None,
                         priority: int | None = None,
                         value_patterns: list[str] | None = None,
                         min_match_ratio: float | None = None,
                         description: str | None = None) -> Pattern

Update an existing pattern (async version).

delete_pattern
async def delete_pattern(pattern_id: str) -> dict[str, Any]

Delete a pattern by ID (async version).

delete_all_patterns
async def delete_all_patterns() -> dict[str, Any]

Delete all custom patterns (async version).

reload_patterns
async def reload_patterns() -> dict[str, Any]

Reload patterns from storage file (async version).

audit_list
async def audit_list(*,
                     operation: str | None = None,
                     status: str | None = None,
                     limit: int = 50,
                     offset: int = 0) -> list[AuditEntry]

List audit log entries (async version).
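
Since audit_list() returns a plain list, reading the whole log means advancing offset page by page. The sketch below shows one way to do that, assuming a page shorter than limit signals the end of the log. fetch_page and make_fake_log are hypothetical stand-ins for the client call, and the entries are integers rather than AuditEntry objects.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def fetch_all(
    fetch_page: Callable[[int, int], Awaitable[list[Any]]],
    *,
    limit: int = 50,
) -> list[Any]:
    """Collect every entry by requesting pages of `limit` items at increasing offsets."""
    entries: list[Any] = []
    offset = 0
    while True:
        page = await fetch_page(limit, offset)
        entries.extend(page)
        # Assumption: a page shorter than `limit` means the log is exhausted.
        if len(page) < limit:
            return entries
        offset += limit


def make_fake_log(total: int) -> Callable[[int, int], Awaitable[list[Any]]]:
    """Hypothetical in-memory log standing in for the server."""
    log = list(range(total))

    async def fetch_page(limit: int, offset: int) -> list[Any]:
        return log[offset:offset + limit]

    return fetch_page
```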

audit_get
async def audit_get(entry_id: str) -> AuditEntry

Get a single audit entry (async version).

exceptions

Anonymization SDK Exceptions.

Custom exception hierarchy for the Anonymization SDK client library. All SDK exceptions inherit from AnonymizationClientError.

AnonymizationClientError

class AnonymizationClientError(Exception)

Base exception for all SDK errors.

ValidationError

class ValidationError(AnonymizationClientError)

Request validation failed (422 from server or client-side validation).

APIError

class APIError(AnonymizationClientError)

API returned an error response (4xx or 5xx status code).

AnonymizationConnectionError

class AnonymizationConnectionError(AnonymizationClientError)

Failed to connect to the API (network/timeout error).

TierRestrictionError

class TierRestrictionError(AnonymizationClientError)

Feature not available in the current server tier (403 from server).

Raised when the server reports that the requested feature requires a higher tier. Inspect the exception's structured fields for details.
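
Because every SDK exception inherits from AnonymizationClientError, handlers should list the specific subclasses first and the base class last. The sketch below redefines the hierarchy locally so that it runs without the SDK installed; in real code the classes would be imported from anonymization_sdk.client. describe_failure and its messages are hypothetical.

```python
from typing import Callable


# Local stand-ins mirroring the hierarchy documented above.
class AnonymizationClientError(Exception): ...
class ValidationError(AnonymizationClientError): ...
class APIError(AnonymizationClientError): ...
class AnonymizationConnectionError(AnonymizationClientError): ...
class TierRestrictionError(AnonymizationClientError): ...


def describe_failure(call: Callable[[], object]) -> str:
    """Run call() and translate any SDK exception into a short message."""
    try:
        call()
    except TierRestrictionError:
        return "feature requires a higher tier"
    except ValidationError:
        return "request was rejected as invalid"
    except AnonymizationConnectionError:
        return "could not reach the API"
    except APIError:
        return "API returned an error status"
    except AnonymizationClientError:
        # Catch-all for any SDK error not handled above.
        return "other SDK error"
    return "ok"
```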

models

Anonymization SDK Response Models and Enums.

Contains all enums (PrivacyModel, DetectionMode, etc.) and response dataclasses (DetectionResult, RiskResult, AnonymizeResult, etc.) used by both the synchronous and asynchronous Anonymization clients.

PrivacyModel

class PrivacyModel(StrEnum)

Supported privacy models.

DetectionMode

class DetectionMode(StrEnum)

Quasi-identifier (QI) detection algorithm modes.

SamplingMethod

class SamplingMethod(StrEnum)

Sampling methods for detection.

RiskLevel

class RiskLevel(StrEnum)

Risk level classifications.

JobStatus

class JobStatus(StrEnum)

Job execution status.
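
These enums derive from StrEnum, which is why client methods can declare parameters such as PrivacyModel | str and accept either form. The sketch below shows the mechanism with a local stand-in class. The member name K_ANONYMITY appears in the signatures above, but its string value and the second member are assumptions made for illustration, and normalize is a hypothetical helper.

```python
try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:  # fallback for older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class PrivacyModel(StrEnum):
    # Stand-in for the SDK enum; the string values are assumed.
    K_ANONYMITY = "k_anonymity"
    L_DIVERSITY = "l_diversity"


def normalize(model: "PrivacyModel | str") -> PrivacyModel:
    """Accept either an enum member or its string value and return the member."""
    return PrivacyModel(model)
```

Calling the enum with an unknown string raises ValueError, so normalizing early turns a typo into a local error instead of a failed request.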

AttributeClassification

@dataclass
class AttributeClassification()

Classification result for a single attribute.

ModelMetrics

@dataclass
class ModelMetrics()

ML model performance metrics.

DetectionResult

@dataclass
class DetectionResult()

Result of QI detection.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DetectionResult"

Create from API response dict.
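
The from_dict classmethods on the response models all follow the same alternative-constructor pattern. A minimal sketch of that pattern follows, using a hypothetical ExampleResult whose field names are not taken from the SDK; how the real models treat missing or unknown keys is not documented here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExampleResult:
    """Hypothetical response model used only to illustrate from_dict."""

    job_id: str
    columns: list[str] = field(default_factory=list)
    elapsed_seconds: float = 0.0

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ExampleResult":
        """Build an instance from a decoded JSON response, ignoring unknown keys."""
        return cls(
            job_id=data["job_id"],  # required: a missing key raises KeyError
            columns=list(data.get("columns", [])),
            elapsed_seconds=float(data.get("elapsed_seconds", 0.0)),
        )
```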

ProsecutorRisk

@dataclass
class ProsecutorRisk(_BaseAttackerRisk)

Prosecutor risk model result.

JournalistRisk

@dataclass
class JournalistRisk(_BaseAttackerRisk)

Journalist risk model result.

MarketerRisk

@dataclass
class MarketerRisk()

Marketer risk model result.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "MarketerRisk"

Create from API response dict.

RiskResult

@dataclass
class RiskResult()

Complete risk metrics result.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "RiskResult"

Create from API response dict.

is_k_anonymous
def is_k_anonymous(k: int) -> bool

Check if data satisfies k-anonymity.
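
As background, a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows. The sketch below checks that definition directly on raw rows. It illustrates the concept only and makes no claim about how RiskResult computes its answer; satisfies_k_anonymity is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Mapping, Sequence


def satisfies_k_anonymity(
    rows: Sequence[Mapping[str, Any]],
    quasi_identifiers: Sequence[str],
    k: int,
) -> bool:
    """Return True if every equivalence class contains at least k rows."""
    # Count how many rows share each combination of quasi-identifier values.
    class_sizes = Counter(tuple(row[qi] for qi in quasi_identifiers) for row in rows)
    return all(size >= k for size in class_sizes.values())
```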

MetricsResult

@dataclass
class MetricsResult()

Anonymization quality metrics.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "MetricsResult"

Create from API response dict.

AnonymizeResult

@dataclass
class AnonymizeResult()

Result of anonymization operation.

result_path

Cloud storage URI of the result, if it was saved to cloud storage.

job_id

Identifier of the saved solution, for use with apply_anon().

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AnonymizeResult"

Create from API response dict.

ApplyResult

@dataclass
class ApplyResult()

Result of applying a saved anonymization solution to new data.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ApplyResult"

Create from API response dict.

ValidationResult

@dataclass
class ValidationResult()

Result of privacy validation.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ValidationResult"

Create from API response dict.

AutoConfigResult

@dataclass
class AutoConfigResult()

Result of auto-configuration generation.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AutoConfigResult"

Create from API response dict.

AutoAnonymizeResult

@dataclass
class AutoAnonymizeResult()

Result of combined detection + anonymization.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AutoAnonymizeResult"

Create from API response dict.

Pattern

@dataclass
class Pattern()

Detection pattern for automatic QI classification.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "Pattern"

Create from API response dict.

PatternListResult

@dataclass
class PatternListResult()

Result of pattern list operation.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "PatternListResult"

Create from API response dict.

JobResponse

@dataclass
class JobResponse()

Response for job submission.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "JobResponse"

Create from API response dict.

JobStatusResponse

@dataclass
class JobStatusResponse()

Response for job status query.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "JobStatusResponse"

Create from API response dict.

JobHistoryEntry

@dataclass
class JobHistoryEntry()

A single point-in-time snapshot from the job audit trail.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "JobHistoryEntry"

Create from API response dict.

JobListResult

@dataclass
class JobListResult()

Paginated list of jobs.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "JobListResult"

Create from API response dict.

DPMechanismType

class DPMechanismType(StrEnum)

Supported batch DP mechanisms.

DPStreamMechanismType

class DPStreamMechanismType(StrEnum)

Supported streaming DP mechanisms.

DPNoiseType

class DPNoiseType(StrEnum)

Supported noise mechanisms.

DPComputeResult

@dataclass
class DPComputeResult()

Result of a batch DP computation.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DPComputeResult"

Create from API response dict.

DPStreamResult

@dataclass
class DPStreamResult()

Result of a streaming DP update.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DPStreamResult"

Create from API response dict.

DPBudgetStatus

@dataclass
class DPBudgetStatus()

Privacy budget status for a session.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DPBudgetStatus"

Create from API response dict.
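
For readers new to privacy budgets, the sketch below shows the basic idea under sequential composition, where the epsilon values of successive queries add up against a fixed total. How the service accounts for budget is not described in this section, so BudgetTracker and its fields are hypothetical and do not mirror DPBudgetStatus.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical per-session budget using basic sequential composition."""

    total_epsilon: float
    spent_epsilon: float = 0.0

    @property
    def remaining(self) -> float:
        return self.total_epsilon - self.spent_epsilon

    def spend(self, epsilon: float) -> None:
        """Charge one query against the budget, refusing to overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.spent_epsilon += epsilon
```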

AuditEntry

@dataclass
class AuditEntry()

A single audit log entry.

from_dict
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AuditEntry"

Create from API response dict.


Last modified: June 18, 2026