PresidioAnonymizerBase

class langchain_experimental.data_anonymizer.presidio.PresidioAnonymizerBase(analyzed_fields: List[str] | None = None, operators: Dict[str, OperatorConfig] | None = None, languages_config: Dict | None = None, add_default_faker_operators: bool = True, faker_seed: int | None = None)

Base Anonymizer using Microsoft Presidio.

See more: https://microsoft.github.io/presidio/

Parameters:

analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config (Optional[Dict]) – Configuration for the NLP engine. First language in the list will be used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.
add_default_faker_operators (bool) – Whether to add the default Faker-based operators for detected entities. Defaults to True.
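
The constructor parameters above can be illustrated with a minimal sketch. It uses the concrete subclass PresidioAnonymizer from the same package, on the assumption that the base class is not instantiated directly. It also assumes that langchain_experimental, Presidio, Faker, and a spaCy model are installed. The names FIELDS and make_anonymizer are hypothetical and exist only for this example.

```python
from typing import List, Optional

# Entity names as used by Presidio's built-in recognizers.
FIELDS: List[str] = ["PERSON", "PHONE_NUMBER", "EMAIL_ADDRESS"]


def make_anonymizer(fields: Optional[List[str]] = None, seed: Optional[int] = None):
    """Build a concrete anonymizer (hypothetical helper, not part of the library)."""
    # Imported lazily so this module loads even when the optional
    # dependencies are not installed.
    from langchain_experimental.data_anonymizer import PresidioAnonymizer

    return PresidioAnonymizer(
        analyzed_fields=fields,            # None means all supported entities
        add_default_faker_operators=True,  # same as the documented default
        faker_seed=seed,                   # None means Faker is seeded randomly
    )
```

A call such as make_anonymizer(FIELDS, seed=42) would restrict detection to the three listed entity types and pass a fixed seed to Faker.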

Methods

`__init__`([analyzed_fields, operators, ...])
`add_operators`(operators)	Add operators to the anonymizer.
`add_recognizer`(recognizer)	Add a recognizer to the analyzer.
`anonymize`(text[, language, allow_list])	Anonymize text.

__init__(analyzed_fields: List[str] | None = None, operators: Dict[str, OperatorConfig] | None = None, languages_config: Dict | None = None, add_default_faker_operators: bool = True, faker_seed: int | None = None)

Parameters:

analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config (Optional[Dict]) – Configuration for the NLP engine. First language in the list will be used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.
add_default_faker_operators (bool) – Whether to add the default Faker-based operators for detected entities. Defaults to True.
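
The languages_config parameter can be sketched as follows. The dictionary shape is an assumption based on Presidio's NLP engine configuration (an engine name plus a list of models), and the spaCy model names are examples that would need to be installed separately. The helpers supported_languages and main_language are hypothetical. They only show which language would be treated as the main one under the "first language in the list" rule described above.

```python
from typing import Any, Dict, List

# Assumed shape of a Presidio NLP engine configuration with two languages.
LANGUAGES_CONFIG: Dict[str, Any] = {
    "nlp_engine_name": "spacy",
    "models": [
        {"lang_code": "en", "model_name": "en_core_web_md"},
        {"lang_code": "pl", "model_name": "pl_core_news_md"},
    ],
}


def supported_languages(config: Dict[str, Any]) -> List[str]:
    """Return the language codes in the order they are configured."""
    return [model["lang_code"] for model in config["models"]]


def main_language(config: Dict[str, Any]) -> str:
    """Return the first configured language, used when none is specified."""
    return supported_languages(config)[0]
```

With this configuration, main_language(LANGUAGES_CONFIG) returns "en", so English would be used whenever anonymize is called without a language argument.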

add_operators(operators: Dict[str, OperatorConfig]) → None

Add operators to the anonymizer.

Parameters:

operators (Dict[str, OperatorConfig]) – Operators to add to the anonymizer.

Return type:

None
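
A minimal sketch of building an operators dictionary for add_operators. It assumes presidio_anonymizer is installed and uses Presidio's built-in "replace" and "mask" operators. OPERATOR_SPECS and build_operators are hypothetical names introduced for this example.

```python
from typing import Any, Dict, Tuple

# Entity name -> (Presidio operator name, operator parameters).
OPERATOR_SPECS: Dict[str, Tuple[str, Dict[str, Any]]] = {
    "PERSON": ("replace", {"new_value": "<PERSON>"}),
    "PHONE_NUMBER": (
        "mask",
        {"masking_char": "*", "chars_to_mask": 4, "from_end": True},
    ),
}


def build_operators(specs: Dict[str, Tuple[str, Dict[str, Any]]] = OPERATOR_SPECS):
    """Turn plain specs into OperatorConfig objects (hypothetical helper)."""
    # Imported lazily so this module loads without presidio_anonymizer.
    from presidio_anonymizer.entities import OperatorConfig

    return {
        entity: OperatorConfig(operator_name, params)
        for entity, (operator_name, params) in specs.items()
    }
```

Given an existing anonymizer instance, the result would be passed as anonymizer.add_operators(build_operators()).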

add_recognizer(recognizer: EntityRecognizer) → None

Add a recognizer to the analyzer.

Parameters:

recognizer (EntityRecognizer) – Recognizer to add to the analyzer.

Return type:

None
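
A minimal sketch of a custom recognizer that could be passed to add_recognizer. It assumes presidio_analyzer is installed and uses its Pattern and PatternRecognizer classes. The entity name EMPLOYEE_ID, the regular expression, and the helper build_recognizer are made up for illustration.

```python
import re

# Hypothetical identifier format: "EMP-" followed by exactly six digits.
EMPLOYEE_ID_REGEX = r"\bEMP-\d{6}\b"


def build_recognizer():
    """Create a regex-based recognizer for the made-up EMPLOYEE_ID entity."""
    # Imported lazily so this module loads without presidio_analyzer.
    from presidio_analyzer import Pattern, PatternRecognizer

    pattern = Pattern(
        name="employee_id_pattern",
        regex=EMPLOYEE_ID_REGEX,
        score=0.9,  # confidence assigned to each match
    )
    return PatternRecognizer(supported_entity="EMPLOYEE_ID", patterns=[pattern])


# The pattern itself can be checked with the standard library alone.
sample_match = re.search(EMPLOYEE_ID_REGEX, "badge EMP-004217 issued")
```

Given an existing anonymizer instance, the recognizer would be registered with anonymizer.add_recognizer(build_recognizer()). A matching operator for the new entity can then be supplied through add_operators if a specific replacement is wanted.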

anonymize(text: str, language: str | None = None, allow_list: List[str] | None = None) → str

Anonymize text.

Parameters:

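text (str) – Text to anonymize.
language (str | None) – Language of the text. Defaults to None, in which case the first language in languages_config is used.
allow_list (List[str] | None) – Terms to leave unchanged even if they are detected as PII. Defaults to None.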

Return type:

str
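
A minimal sketch of calling anonymize with the optional language and allow_list arguments. The helper anonymize_all is hypothetical. It accepts any object that exposes the anonymize signature documented above, for example an instance of a concrete subclass such as PresidioAnonymizer.

```python
from typing import List, Optional


def anonymize_all(
    anonymizer,
    texts: List[str],
    language: Optional[str] = None,
    allow_list: Optional[List[str]] = None,
) -> List[str]:
    """Anonymize several texts with the same language and allow list."""
    return [
        # language=None defers to the anonymizer's main language.
        anonymizer.anonymize(text, language=language, allow_list=allow_list)
        for text in texts
    ]
```

For instance, anonymize_all(anonymizer, ["Call John Smith"], language="en", allow_list=["John Smith"]) would ask the anonymizer to leave "John Smith" unchanged while processing the text as English.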