Duplicate Detection



Duplicate detection (also known as entity resolution, entity identification, the merge/purge problem, object matching, or record linkage) is the process of identifying data items (tuples, records, XML elements) that represent the same real-world entity. The result of duplicate detection is a partition of the considered elements: each group in the partition contains the items judged to represent one real-world entity.
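To make the idea of a partition concrete, here is a minimal sketch, not taken from any of the tools listed on this page. It assumes that records are plain strings, that similarity is measured with Python's standard `difflib.SequenceMatcher`, and that a threshold of 0.8 decides whether two records match; the function names `similarity` and `detect_duplicates` are hypothetical. Matching pairs are merged transitively with a union-find structure, so every record ends up in exactly one group.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def detect_duplicates(records: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Partition records into groups of presumed duplicates.

    The threshold is an illustrative assumption, not a recommended value.
    """
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i: int) -> int:
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs comparison: merge the groups of every matching pair.
    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    # Collect the records of each group; together the groups form a partition.
    groups: dict[int, list[str]] = {}
    for i, record in enumerate(records):
        groups.setdefault(find(i), []).append(record)
    return list(groups.values())


records = ["John Smith", "Jon Smith", "Mary Jones", "john smith", "Marie Curie"]
partition = detect_duplicates(records)
for group in partition:
    print(group)
```

With these inputs the three spellings of "John Smith" fall into one group, while "Mary Jones" and "Marie Curie" each remain alone. The all-pairs comparison is quadratic in the number of records, so this sketch only suits small inputs.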

Tools and Frameworks for Duplicate Detection

HumMer (Humboldt-Merger): Humboldt-Universität zu Berlin, GER; Hasso-Plattner Institut, GER
Febrl (Freely extensible biomedical record linkage): Australian National University, AU; New South Wales Department of Health, AU
Fever (Framework for EValuating Entity Resolution): University of Leipzig, GER
Oyster (Open sYSTem Entity Resolution): UALR Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ), USA