PresidioReversibleAnonymizer

class langchain_experimental.data_anonymizer.presidio.PresidioReversibleAnonymizer(analyzed_fields: List[str] | None = None, operators: Dict[str, OperatorConfig] | None = None, languages_config: Dict | None = None, add_default_faker_operators: bool = True, faker_seed: int | None = None)

Reversible Anonymizer using Microsoft Presidio.

Parameters:
  • analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.

  • operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/

  • languages_config (Optional[Dict]) – Configuration for the NLP engine. The first language in the list is used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/

  • faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.

  • add_default_faker_operators (bool) – Whether to add the default Faker-based operators. Defaults to True.
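The languages_config parameter can be illustrated with a plain dictionary. The sketch below assumes the Presidio NLP-engine configuration layout (an nlp_engine_name plus a models list of lang_code/model_name pairs). The spaCy model names and the helper main_language are illustrative assumptions, not part of the class.

```python
from typing import Dict, List

# Assumed layout: the Presidio NLP-engine configuration format.
# The spaCy model names are examples and must be installed separately.
languages_config: Dict = {
    "nlp_engine_name": "spacy",
    "models": [
        {"lang_code": "en", "model_name": "en_core_web_md"},
        {"lang_code": "es", "model_name": "es_core_news_md"},
    ],
}


def main_language(config: Dict) -> str:
    """Hypothetical helper: return the first configured language, i.e. the
    one used when anonymize() is called without an explicit language."""
    models: List[Dict[str, str]] = config["models"]
    return models[0]["lang_code"]


print(main_language(languages_config))  # en
```

A dictionary like this would be passed as PresidioReversibleAnonymizer(languages_config=languages_config).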

Attributes

anonymizer_mapping

Return the anonymizer mapping. This is just the reverse version of the deanonymizer mapping.

deanonymizer_mapping

Return the deanonymizer mapping.
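The relationship between the two attributes can be sketched with plain dictionaries. This is a minimal illustration, assuming the mapping is a nested Dict[str, Dict[str, str]] keyed by entity type whose inner dictionary maps each fake value to its original. The function reverse_mapping and the sample data are hypothetical, not the library's implementation.

```python
from typing import Dict

Mapping = Dict[str, Dict[str, str]]  # entity type -> {key: value}


def reverse_mapping(mapping: Mapping) -> Mapping:
    """Swap keys and values inside every entity type."""
    return {
        entity_type: {value: key for key, value in pairs.items()}
        for entity_type, pairs in mapping.items()
    }


# Assumed direction: fake value -> original value (sample data is made up).
deanonymizer_mapping: Mapping = {
    "PERSON": {"Jane Roe": "John Doe"},
    "EMAIL_ADDRESS": {"jane.roe@example.com": "john.doe@example.com"},
}

anonymizer_mapping = reverse_mapping(deanonymizer_mapping)
print(anonymizer_mapping["PERSON"])  # {'John Doe': 'Jane Roe'}
```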

Methods

__init__([analyzed_fields, operators, ...])

add_operators(operators)

Add operators to the anonymizer.

add_recognizer(recognizer)

Add a recognizer to the analyzer.

anonymize(text[, language, allow_list])

Anonymize text.

deanonymize(text_to_deanonymize[, ...])

Deanonymize text.

load_deanonymizer_mapping(file_path)

Load the deanonymizer mapping from a JSON or YAML file.

reset_deanonymizer_mapping()

Reset the deanonymizer mapping.

save_deanonymizer_mapping(file_path)

Save the deanonymizer mapping to a JSON or YAML file.

__init__(analyzed_fields: List[str] | None = None, operators: Dict[str, OperatorConfig] | None = None, languages_config: Dict | None = None, add_default_faker_operators: bool = True, faker_seed: int | None = None)
Parameters:
  • analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.

  • operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/

  • languages_config (Optional[Dict]) – Configuration for the NLP engine. The first language in the list is used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/

  • faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.

  • add_default_faker_operators (bool) – Whether to add the default Faker-based operators. Defaults to True.

add_operators(operators: Dict[str, OperatorConfig]) → None

Add operators to the anonymizer.

Parameters:

operators (Dict[str, OperatorConfig]) – Operators to add to the anonymizer.

Return type:

None

add_recognizer(recognizer: EntityRecognizer) → None

Add a recognizer to the analyzer.

Parameters:

recognizer (EntityRecognizer) – Recognizer to add to the analyzer.

Return type:

None
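A recognizer for a custom entity might be built as sketched below. The entity name EXAMPLE_ID, its regular expression, and the helper build_recognizer are hypothetical; Pattern and PatternRecognizer come from presidio-analyzer, which is imported lazily so the pattern can be checked without it.

```python
import re

# Hypothetical entity: three capital letters followed by six digits.
ENTITY_NAME = "EXAMPLE_ID"
ID_REGEX = r"[A-Z]{3}\d{6}"


def build_recognizer():
    """Build a regex-based recognizer (requires presidio-analyzer)."""
    from presidio_analyzer import Pattern, PatternRecognizer

    pattern = Pattern(name="example_id_pattern", regex=ID_REGEX, score=1.0)
    return PatternRecognizer(supported_entity=ENTITY_NAME, patterns=[pattern])


# The pattern itself can be sanity-checked with the standard library.
print(bool(re.fullmatch(ID_REGEX, "ABC123456")))  # True
```

With an existing anonymizer instance, the recognizer would be registered via anonymizer.add_recognizer(build_recognizer()).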

anonymize(text: str, language: str | None = None, allow_list: List[str] | None = None) → str

Anonymize text.

Parameters:
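  • text (str) – Text to anonymize.

  • language (str | None) – Language to use for PII analysis. If None, the first language given in languages_config is used.

  • allow_list (List[str] | None) – Values that should be left as-is even if they are detected as PII.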

Return type:

str
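A typical round trip pairs anonymize with deanonymize. The sketch below is one possible usage, not a reference implementation: the wrapper anonymize_round_trip is hypothetical, the chosen entity types and seed are arbitrary, and running it requires langchain-experimental, the Presidio packages, Faker, and a spaCy language model.

```python
from typing import List, Optional, Tuple


def anonymize_round_trip(
    text: str, allow_list: Optional[List[str]] = None
) -> Tuple[str, str]:
    """Anonymize text, then restore it from the recorded mapping."""
    # Imported lazily because the optional dependencies are heavy.
    from langchain_experimental.data_anonymizer import (
        PresidioReversibleAnonymizer,
    )

    anonymizer = PresidioReversibleAnonymizer(
        analyzed_fields=["PERSON", "EMAIL_ADDRESS"],
        faker_seed=42,  # fixed seed so the fake values are repeatable
    )
    anonymized = anonymizer.anonymize(text, allow_list=allow_list)
    restored = anonymizer.deanonymize(anonymized)
    return anonymized, restored
```

The same anonymizer instance is used for both calls because the deanonymizer mapping is stored on the instance.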

deanonymize(text_to_deanonymize: str, deanonymizer_matching_strategy: Callable[[str, Dict[str, Dict[str, str]]], str] = exact_matching_strategy) → str

Deanonymize text.

Parameters:
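  • text_to_deanonymize (str) – Text to deanonymize.

  • deanonymizer_matching_strategy (Callable[[str, Dict[str, Dict[str, str]]], str]) – Function that takes the text and the deanonymizer mapping and returns the deanonymized text. Defaults to exact_matching_strategy.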

Return type:

str
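Any callable matching the signature above can serve as a matching strategy. The following sketch shows a hypothetical ignore_case_strategy that replaces fake values regardless of letter case. It assumes the inner dictionaries map fake values to originals and is not one of the library's built-in strategies.

```python
import re
from typing import Dict


def ignore_case_strategy(
    text: str, deanonymizer_mapping: Dict[str, Dict[str, str]]
) -> str:
    """Replace each fake value with its original, ignoring letter case."""
    for pairs in deanonymizer_mapping.values():
        for fake, original in pairs.items():
            # A function replacement avoids backslash handling in `original`.
            text = re.sub(
                re.escape(fake),
                lambda _match: original,
                text,
                flags=re.IGNORECASE,
            )
    return text


mapping = {"PERSON": {"Jane Roe": "John Doe"}}  # made-up sample data
print(ignore_case_strategy("JANE ROE called.", mapping))  # John Doe called.
```

Such a function would be supplied as anonymizer.deanonymize(text, deanonymizer_matching_strategy=ignore_case_strategy).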

load_deanonymizer_mapping(file_path: Path | str) → None

Load the deanonymizer mapping from a JSON or YAML file.

Parameters:

file_path (Path | str) – Path to file to load the mapping from.

Return type:

None

Example:

.. code-block:: python

    anonymizer.load_deanonymizer_mapping(file_path="path/mapping.json")

reset_deanonymizer_mapping() → None

Reset the deanonymizer mapping.

Return type:

None

save_deanonymizer_mapping(file_path: Path | str) → None

Save the deanonymizer mapping to a JSON or YAML file.

Parameters:

file_path (Path | str) – Path to file to save the mapping to.

Return type:

None

Example:

.. code-block:: python

    anonymizer.save_deanonymizer_mapping(file_path="path/mapping.json")
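Persisting the mapping lets text anonymized in one session be deanonymized in a later one. The sketch below imitates the save/load round trip for the JSON case using only the standard library. The helpers save_mapping and load_mapping are stand-ins for the class methods, and the on-disk layout (the nested mapping dumped as-is) is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Union

Mapping = Dict[str, Dict[str, str]]


def save_mapping(mapping: Mapping, file_path: Union[Path, str]) -> None:
    """Write the mapping to a JSON file."""
    Path(file_path).write_text(json.dumps(mapping, indent=2), encoding="utf-8")


def load_mapping(file_path: Union[Path, str]) -> Mapping:
    """Read a mapping back from a JSON file."""
    return json.loads(Path(file_path).read_text(encoding="utf-8"))


mapping: Mapping = {"PERSON": {"Jane Roe": "John Doe"}}  # made-up sample data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "mapping.json"
    save_mapping(mapping, path)
    restored = load_mapping(path)

print(restored == mapping)  # True
```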

Examples using PresidioReversibleAnonymizer