Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated Apr 28, 2025 - Java
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining different data sets based on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
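To make the definition concrete, here is a minimal sketch of matching two records that share no common identifier. It compares name strings by Jaccard similarity over normalized word tokens. The class `RecordMatcher`, its methods, and the 0.8 threshold are hypothetical names and values chosen for illustration; they are not taken from any toolkit listed on this page, and production systems typically add blocking, multiple attributes, and learned similarity models.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of any listed project.
final class RecordMatcher {

    // Lowercase the value, turn every run of non-alphanumeric characters
    // into a single space, and return the resulting set of word tokens.
    // Assumption: ASCII input; accented letters would be treated as separators.
    static Set<String> tokens(String value) {
        String cleaned = value.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", " ")
                .trim();
        if (cleaned.isEmpty()) {
            return new HashSet<>();
        }
        return new HashSet<>(Arrays.asList(cleaned.split(" ")));
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    // Defined here as 0.0 when both sets are empty.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // Two values are treated as the same entity when their token
    // similarity reaches the (assumed) threshold.
    static boolean sameEntity(String left, String right, double threshold) {
        return jaccard(tokens(left), tokens(right)) >= threshold;
    }

    public static void main(String[] args) {
        // Same tokens in a different order and with different punctuation.
        System.out.println(sameEntity("Dunn, Halbert L.", "Halbert L Dunn", 0.8));  // true
        // No tokens in common.
        System.out.println(sameEntity("Halbert L. Dunn", "Howard Newcombe", 0.8)); // false
    }
}
```

Token-set comparison ignores word order and punctuation, which is why the two spellings of the first name match; it does not handle typos or abbreviations, where an edit-distance or phonetic measure would be a more suitable choice.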
An open-source, highly scalable toolkit in Java for entity resolution.
Entity resolution for Elasticsearch.
OpenRefine reconciliation services for VIAF, ORCID, and Open Library + framework for creating more.
ReCiter: an enterprise open source author disambiguation system for academic institutions
Minoan ER is an Entity Resolution (ER) framework, built by researchers in Crete (the land of the ancient Minoan civilization). Entity resolution aims to identify descriptions that refer to the same entity within or across knowledge bases.
UI for JedAI Toolkit
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
A general purpose deduplication framework
Mirror of https://bitbucket.org/resteorts/smered
Parallel Blocking in MapReduce
Java client for entity-fishing
An implementation of an instance matching algorithm on the GeoLink repository.
Created by Halbert L. Dunn
Released 1946