data_anonymizer

The data_anonymizer module contains both anonymizers and deanonymizers. It is built on the [Microsoft Presidio](https://microsoft.github.io/presidio/) library.

Anonymizers replace the text of a Personally Identifiable Information (PII) entity with another value by applying an operator (e.g. replace, mask, redact, encrypt).
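To make the idea of an operator concrete, here is a minimal pure-Python sketch of three of them applied to an already-detected span. The helper names (`replace_op`, `mask_op`, `redact_op`, `apply_operator`) and their parameters are hypothetical and exist only for illustration; they are not Presidio's or this module's API, and entity detection is assumed to have happened elsewhere.

```python
from typing import Callable


# Hypothetical operators for illustration only; not Presidio's implementations.
def replace_op(value: str, new_value: str = "<PII>") -> str:
    """Swap the whole entity text for a fixed placeholder."""
    return new_value


def mask_op(value: str, masking_char: str = "*", chars_to_mask: int = 4) -> str:
    """Overwrite the first `chars_to_mask` characters of the entity text."""
    n = min(chars_to_mask, len(value))
    return masking_char * n + value[n:]


def redact_op(value: str) -> str:
    """Remove the entity text entirely."""
    return ""


def apply_operator(text: str, start: int, end: int,
                   operator: Callable[[str], str]) -> str:
    """Apply `operator` to text[start:end] and splice the result back in."""
    return text[:start] + operator(text[start:end]) + text[end:]


text = "Call me at 555-0100."
# Assumption: a recognizer has already located the phone number span.
start = text.index("555-0100")
end = start + len("555-0100")

replaced = apply_operator(text, start, end, replace_op)
masked = apply_operator(text, start, end, mask_op)
redacted = apply_operator(text, start, end, redact_op)
```

Of these three, only an operator that keeps enough information to recover the original (such as encryption, or replacement with a recorded mapping) can later be reverted.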

Deanonymizers are used to revert the anonymization operation (e.g. to decrypt an encrypted text).
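The round trip can be sketched without any dependencies, assuming a replace-style operator whose substitutions are recorded in a reverse mapping. The functions `anonymize` and `deanonymize` below are hypothetical stand-ins, not the classes listed on this page, and the PII values are supplied by hand rather than detected.

```python
from typing import Dict, Tuple


def anonymize(text: str, entities: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Replace each known PII value with its placeholder.

    Returns the anonymized text and a placeholder -> original mapping
    that can be used later to revert the operation.
    """
    reverse_mapping: Dict[str, str] = {}
    for original, placeholder in entities.items():
        if original in text:
            text = text.replace(original, placeholder)
            reverse_mapping[placeholder] = original
    return text, reverse_mapping


def deanonymize(text: str, reverse_mapping: Dict[str, str]) -> str:
    """Revert the anonymization by substituting the originals back in."""
    for placeholder, original in reverse_mapping.items():
        text = text.replace(placeholder, original)
    return text


original_text = "Jane Doe emailed jane@example.com yesterday."
# Assumption: the PII values and their placeholders are already known.
entities = {"Jane Doe": "<PERSON>", "jane@example.com": "<EMAIL_ADDRESS>"}

anonymized, mapping = anonymize(original_text, entities)
restored = deanonymize(anonymized, mapping)
```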

Classes

data_anonymizer.base.AnonymizerBase()

Base abstract class for anonymizers.

data_anonymizer.base.ReversibleAnonymizerBase()

Base abstract class for reversible anonymizers.

data_anonymizer.deanonymizer_mapping.DeanonymizerMapping(...)

Deanonymizer mapping.

data_anonymizer.presidio.PresidioAnonymizer([...])

Anonymizer using Microsoft Presidio.

data_anonymizer.presidio.PresidioAnonymizerBase([...])

Base Anonymizer using Microsoft Presidio.

data_anonymizer.presidio.PresidioReversibleAnonymizer([...])

Reversible Anonymizer using Microsoft Presidio.

Functions

data_anonymizer.deanonymizer_mapping.create_anonymizer_mapping(...)

Create or update the mapping used to anonymize and/or deanonymize a text.

data_anonymizer.deanonymizer_mapping.format_duplicated_operator(...)

Format the operator name with the count.
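One plausible reading of "format the operator name with the count" is that repeated placeholders of the same type get a numeric suffix so they stay distinguishable. The sketch below illustrates that idea; the function name `format_with_count` and the exact naming scheme (`<PERSON>` becoming `<PERSON_2>`) are assumptions made for this example, not a statement of what the library function returns.

```python
import re


def format_with_count(operator_name: str, count: int) -> str:
    """Append `count` to a placeholder name, replacing any earlier suffix.

    Hypothetical helper: angle brackets, if present, are kept around the result.
    """
    inner = re.sub(r"[<>]", "", operator_name)   # drop surrounding brackets
    inner = re.sub(r"_\d+$", "", inner)          # drop an existing numeric suffix
    if operator_name.startswith("<") and operator_name.endswith(">"):
        return f"<{inner}_{count}>"
    return f"{inner}_{count}"


second_person = format_with_count("<PERSON>", 2)
third_person = format_with_count("<PERSON_2>", 3)
bare_name = format_with_count("PERSON", 2)
```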

data_anonymizer.deanonymizer_matching_strategies.case_insensitive_matching_strategy(...)

Case insensitive matching strategy for deanonymization.
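A case-insensitive strategy can be sketched with the standard `re` module. The function name `deanonymize_case_insensitive` and the mapping shape (entity type, then anonymized value to original value) are assumptions chosen for this illustration.

```python
import re
from typing import Dict


def deanonymize_case_insensitive(text: str,
                                 mapping: Dict[str, Dict[str, str]]) -> str:
    """Replace every anonymized value with its original, ignoring case."""
    for entity_type in mapping:
        for anonymized, original in mapping[entity_type].items():
            # A function replacement avoids backslash handling in `original`.
            text = re.sub(
                re.escape(anonymized),
                lambda _match, value=original: value,
                text,
                flags=re.IGNORECASE,
            )
    return text


# Assumed mapping shape: entity type -> {anonymized value: original value}.
mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
result = deanonymize_case_insensitive(
    "maria lynch wrote to MARIA LYNCH.", mapping
)
```

This variant is useful when the text passed back for deanonymization may have had its casing altered, for example by a language model.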

data_anonymizer.deanonymizer_matching_strategies.combined_exact_fuzzy_matching_strategy(...)

Combined exact and fuzzy matching strategy for deanonymization.

data_anonymizer.deanonymizer_matching_strategies.exact_matching_strategy(...)

Exact matching strategy for deanonymization.
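Exact matching is the simplest case: each anonymized value is substituted only where it appears verbatim. The sketch below assumes a mapping of entity type to anonymized-to-original pairs; the name `deanonymize_exact` is hypothetical.

```python
from typing import Dict


def deanonymize_exact(text: str, mapping: Dict[str, Dict[str, str]]) -> str:
    """Replace anonymized values with originals using exact string matches."""
    for entity_type in mapping:
        for anonymized, original in mapping[entity_type].items():
            text = text.replace(anonymized, original)
    return text


# Assumed mapping shape: entity type -> {anonymized value: original value}.
mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
result = deanonymize_exact("Maria Lynch called; maria lynch did not.", mapping)
```

Note that the lowercase occurrence is left untouched, which is the case the other strategies are meant to handle.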

data_anonymizer.deanonymizer_matching_strategies.fuzzy_matching_strategy(...)

Fuzzy matching strategy for deanonymization.

data_anonymizer.deanonymizer_matching_strategies.ngram_fuzzy_matching_strategy(...)

N-gram fuzzy matching strategy for deanonymization.
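The general idea behind n-gram fuzzy matching can be sketched with the standard library: slide a window of n words over the text and replace any window that is similar enough to an anonymized value. This sketch uses `difflib.SequenceMatcher` as the similarity measure, which is a substitution made for illustration and not necessarily what the library uses; the function name, the `threshold` parameter, and the mapping shape are likewise assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def deanonymize_ngram_fuzzy(text: str,
                            mapping: Dict[str, Dict[str, str]],
                            threshold: float = 0.85) -> str:
    """Replace word n-grams that closely resemble an anonymized value.

    Simplifications: whitespace is normalized by split/join, and punctuation
    attached to a matched word is replaced along with it.
    """
    words = text.split()
    for entity_type in mapping:
        for anonymized, original in mapping[entity_type].items():
            n = len(anonymized.split())          # window size in words
            result: List[str] = []
            i = 0
            while i < len(words):
                window = words[i:i + n]
                candidate = " ".join(window).lower()
                score = SequenceMatcher(None, candidate, anonymized.lower()).ratio()
                if len(window) == n and score >= threshold:
                    result.append(original)      # close enough: restore original
                    i += n
                else:
                    result.append(words[i])      # keep the word and move on
                    i += 1
            words = result
    return " ".join(words)


# Assumed mapping shape: entity type -> {anonymized value: original value}.
mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
# "Lynhc" is a deliberate typo that an exact match would miss.
result = deanonymize_ngram_fuzzy("Please ask Maria Lynhc about it", mapping)
```

The threshold trades recall for precision: lowering it tolerates more distortion but raises the risk of replacing unrelated words.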

data_anonymizer.faker_presidio_mapping.get_pseudoanonymizer_mapping([seed])

Get a mapping from entity types to the replacements used to pseudo-anonymize them.