ngram_fuzzy_matching_strategy
- langchain_experimental.data_anonymizer.deanonymizer_matching_strategies.ngram_fuzzy_matching_strategy(text: str, deanonymizer_mapping: Dict[str, Dict[str, str]], fuzzy_threshold: int = 85, use_variable_length: bool = True) → str
N-gram fuzzy matching strategy for deanonymization.
It replaces each anonymized entity in the text with its original value. To locate an anonymized entity, it generates n-grams from the text with the same length as the entity and compares each one to the entity using fuzzy matching.
- Parameters:
text (str) – text to deanonymize
deanonymizer_mapping (Dict[str, Dict[str, str]]) – mapping between anonymized entities and original ones
fuzzy_threshold (int) – similarity score threshold that an n-gram must pass to count as a match (default: 85)
use_variable_length (bool) – whether to use (n-1)-, n-, and (n+1)-grams, where n is the length of the anonymized entity, or only n-grams (default: True)
- Return type:
str
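
The matching idea described above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the names `similarity`, `word_ngrams`, and `ngram_fuzzy_replace_sketch` are hypothetical, the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy scorer the library uses, and treating n as a word count is an assumption.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (stand-in scorer, an assumption)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def word_ngrams(words: List[str], n: int) -> List[str]:
    """All runs of n consecutive words, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_fuzzy_replace_sketch(
    text: str,
    mapping: Dict[str, Dict[str, str]],
    threshold: float = 85,
    use_variable_length: bool = True,
) -> str:
    """Hypothetical sketch: swap fuzzy-matched anonymized entities for originals."""
    words = text.split()
    taken: Set[int] = set()  # word indices already claimed by a match
    replacements: List[Tuple[int, int, str]] = []

    for entities in mapping.values():
        for anonymized, original in entities.items():
            n = len(anonymized.split())  # assumption: n counts words
            # Exact length first, then one word shorter and one longer
            # (this ordering is a choice made for the sketch).
            lengths = [n, n - 1, n + 1] if use_variable_length else [n]
            for size in lengths:
                if size <= 0:
                    continue
                for start, segment in enumerate(word_ngrams(words, size)):
                    span = range(start, start + size)
                    if taken.isdisjoint(span) and similarity(anonymized, segment) > threshold:
                        replacements.append((start, size, original))
                        taken.update(span)

    # Apply right to left so earlier start indices stay valid.
    for start, size, original in sorted(replacements, reverse=True):
        words[start:start + size] = original.split()
    return " ".join(words)


# The mapping goes from anonymized value to original value, grouped by entity type.
mapping = {"PERSON": {"Maria Lynch": "Slim Shady"}}
print(ngram_fuzzy_replace_sketch("Ask Marina Lynch about the invoice", mapping))
# prints "Ask Slim Shady about the invoice"
```

In this sketch the misspelled "Marina Lynch" still scores above the threshold against "Maria Lynch", so it is replaced. With `use_variable_length=True` the sketch also tries segments one word shorter and one word longer than the entity, which lets it catch variants such as an inserted middle initial.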