ngram_fuzzy_matching_strategy

langchain_experimental.data_anonymizer.deanonymizer_matching_strategies.ngram_fuzzy_matching_strategy(text: str, deanonymizer_mapping: Dict[str, Dict[str, str]], fuzzy_threshold: int = 85, use_variable_length: bool = True) → str

N-gram fuzzy matching strategy for deanonymization.

Replaces every anonymized entity in the text with its original value. To locate an anonymized entity, the function generates n-grams from the text that have the same length as the entity, then fuzzy-matches each n-gram against the entity to find its position.

Parameters:
  • text (str) – text to deanonymize

  • deanonymizer_mapping (Dict[str, Dict[str, str]]) – mapping between anonymized entities and original ones

  • fuzzy_threshold (int) – fuzzy matching threshold; defaults to 85

  • use_variable_length (bool) – whether to use (n-1, n, n+1)-grams or just n-grams, where n is the length of the anonymized entity; defaults to True

Return type:

str
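
Example:

The following is a minimal, self-contained sketch of the idea described above, not the library's implementation. It rests on several assumptions: n-grams are built from whitespace-separated words, the mapping is laid out as {entity_type: {anonymized: original}}, and the standard library's difflib.SequenceMatcher stands in for whatever fuzzy matcher the library uses, so scores may differ from the real function. The names word_ngrams, similarity, and sketch_ngram_deanonymize are hypothetical, and the overlap handling (keep the best-scoring match) is a choice made for this sketch only.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple


def word_ngrams(words: List[str], n: int) -> List[str]:
    """Return every run of n consecutive words, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (difflib stand-in)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sketch_ngram_deanonymize(
    text: str,
    mapping: Dict[str, Dict[str, str]],
    fuzzy_threshold: int = 85,
    use_variable_length: bool = True,
) -> str:
    """Hypothetical sketch: swap fuzzy-matched n-grams for original values."""
    words = text.split()
    # Each candidate is (score, start word index, length in words, original).
    candidates: List[Tuple[float, int, int, str]] = []
    for entities in mapping.values():
        for anonymized, original in entities.items():
            n = len(anonymized.split())
            lengths = [n - 1, n, n + 1] if use_variable_length else [n]
            for size in lengths:
                if size < 1:
                    continue  # a 0-gram is meaningless
                for start, gram in enumerate(word_ngrams(words, size)):
                    score = similarity(anonymized, gram)
                    if score >= fuzzy_threshold:
                        candidates.append((score, start, size, original))

    # Assumption of this sketch: keep best-scoring, non-overlapping matches.
    candidates.sort(key=lambda c: -c[0])
    taken: Set[int] = set()
    chosen: List[Tuple[int, int, str]] = []
    for _score, start, size, original in candidates:
        span = set(range(start, start + size))
        if span & taken:
            continue
        taken |= span
        chosen.append((start, size, original))

    # Replace from right to left so earlier word indices stay valid.
    for start, size, original in sorted(chosen, reverse=True):
        words[start:start + size] = [original]
    return " ".join(words)


mapping = {"PERSON": {"Maria Lynch": "Slim Shady"}}
print(sketch_ngram_deanonymize("Are you maria lynch today", mapping))
# prints "Are you Slim Shady today"
```

In this sketch, a misspelled occurrence such as "Maria Lynhc" still clears the default threshold of 85, while raising fuzzy_threshold to 95 leaves it untouched. This illustrates the trade-off the threshold controls: lower values tolerate more drift in the anonymized text but increase the risk of false matches.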