Abstract
Privacy-preserving record linkage (PPRL) determines records representing the same entity while guaranteeing the privacy of individuals. A common approach is to encode plaintext data of records into Bloom filters that enable efficient calculation of similarities. A crucial step of PPRL is the classification of Bloom filter pairs as match or non-match based on computed similarities. In the context of record linkage, several weighting schemes and classification methods are available. The majority of weighting methods determine and adapt weights by applying the Fellegi & Sunter model for each attribute. In the PPRL domain, the attributes of a record are encoded in a joint record-level Bloom filter to impede cryptanalysis attacks so that the application of existing attribute-wise weighting approaches is not feasible. We study methods that use attribute-specific weights in record-level encodings and integrate weight adaptation approaches based on individual value frequencies. The experiments on real-world datasets show that frequency-dependent weighting schemes improve the linkage quality as well as the robustness with regard to the threshold selection.
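To make the idea of attribute-specific weights in a record-level Bloom filter concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that an attribute's weight is expressed as the number of hash functions applied to each of its character bigrams, that the filter is represented as a set of bit positions, and that pairs are compared with the Dice coefficient. The names `encode_record`, `hashes_per_attribute`, `SECRET_KEY`, and the filter size of 1024 bits are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-secret"  # hypothetical key agreed on by the data owners


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode_record(record, hashes_per_attribute, size=1024):
    """Encode all attributes into one record-level Bloom filter.

    The filter is returned as the set of bit positions that are set to 1.
    Assumption: an attribute with more hash functions sets more bits per
    bigram and therefore contributes more to the similarity.
    """
    bits = set()
    for attribute, value in record.items():
        k = hashes_per_attribute[attribute]
        for gram in bigrams(value):
            for seed in range(k):
                message = f"{attribute}|{seed}|{gram}".encode()
                digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
                bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two Bloom filters given as sets of bit positions."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    weights = {"first_name": 2, "last_name": 4}  # illustrative weights only
    r1 = {"first_name": "katherine", "last_name": "smith"}
    r2 = {"first_name": "catherine", "last_name": "smith"}
    r3 = {"first_name": "bob", "last_name": "jones"}
    f1, f2, f3 = (encode_record(r, weights) for r in (r1, r2, r3))
    print("similar pair:   ", round(dice(f1, f2), 3))
    print("dissimilar pair:", round(dice(f1, f3), 3))
```

A pair would then be classified as a match when its similarity reaches a chosen threshold. A frequency-dependent scheme, as studied in the paper, would additionally adapt the weights to how common an individual value is; that step is not shown here.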