
Value-specific weighting for record-level encodings in privacy-preserving record linkage

Gesellschaft für Informatik e.V. (2023)

Abstract

Privacy-preserving record linkage (PPRL) determines records representing the same entity while guaranteeing the privacy of individuals. A common approach is to encode plaintext data of records into Bloom filters that enable efficient calculation of similarities. A crucial step of PPRL is the classification of Bloom filter pairs as match or non-match based on computed similarities. In the context of record linkage, several weighting schemes and classification methods are available. The majority of weighting methods determine and adapt weights by applying the Fellegi-Sunter model for each attribute. In the PPRL domain, the attributes of a record are encoded in a joint record-level Bloom filter to impede cryptanalysis attacks, so that the application of existing attribute-wise weighting approaches is not feasible. We study methods that use attribute-specific weights in record-level encodings and integrate weight adaptation approaches based on individual value frequencies. The experiments on real-world datasets show that frequency-dependent weighting schemes improve the linkage quality as well as the robustness with regard to the threshold selection.
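
To make the pipeline described in the abstract concrete, the following is a minimal sketch of a record-level Bloom filter encoding with attribute-specific weights, followed by similarity computation and threshold classification. It is not the authors' implementation. The function names (`qgrams`, `encode_record`, `dice`, `classify`), the filter size, the use of bigrams, the SHA-256-based hashing, the Dice coefficient, and the idea of expressing an attribute weight as a number of hash functions per attribute are all assumptions chosen for illustration. The value-frequency-based weight adaptation studied in the paper is not shown.

```python
import hashlib


def qgrams(value, q=2):
    """Split a padded, lower-cased string into its set of overlapping q-grams."""
    pad = "_" * (q - 1)
    padded = pad + value.lower() + pad
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def encode_record(record, hashes_per_attr, size=1024):
    """Encode all attributes of a record into one record-level Bloom filter.

    The filter is returned as the set of bit positions that are set.
    hashes_per_attr is a hypothetical weighting device: an attribute with
    more hash functions sets more bits and so contributes more to the
    similarity of two filters.
    """
    bits = set()
    for attr, value in record.items():
        for gram in qgrams(value):
            for k in range(hashes_per_attr[attr]):
                # Salt each hash with the attribute name and the hash index.
                digest = hashlib.sha256(f"{attr}|{k}|{gram}".encode()).digest()
                bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two Bloom filters given as sets of set bits."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def classify(a, b, threshold=0.8):
    """Label a filter pair as match or non-match using a fixed threshold."""
    return "match" if dice(a, b) >= threshold else "non-match"


# Hypothetical weights: the last name is assumed to be more discriminative.
weights = {"first_name": 5, "last_name": 10}
rec_a = encode_record({"first_name": "Anna", "last_name": "Schmidt"}, weights)
rec_b = encode_record({"first_name": "Ana", "last_name": "Schmidt"}, weights)
print(dice(rec_a, rec_b), classify(rec_a, rec_b))
```

In this sketch the weights are fixed per attribute. A frequency-dependent scheme in the spirit of the abstract would additionally adjust the contribution of an attribute depending on how common the specific value is, but how that is done in the paper is not stated here.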
