Update! Alert! Attention!
We extended our work on Collective Scoping to Collaborative Scoping (available on GitHub and as an accepted paper at EDBT26), a more scalable and robust method for assessing schema linkability in heterogeneous schema matching scenarios.
Scoping: Towards streamlined entity collections for multi-sourced Entity Resolution with self-supervised agents
The goal of Scoping is to shrink the space of candidate entity pairs by ranking, detecting, and removing unlinkable entities with outlier algorithms and self-supervised, reusable autoencoders, while leaving the set of true linkages intact.
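The rank, detect, and remove idea can be sketched with a toy example. This is only an illustration under stated assumptions, not the method from the paper: distance from the centroid stands in for the outlier algorithms and autoencoders, and the entity names, the 2-D vectors, and the `keep_ratio` parameter are all hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def outlier_scores(entities):
    """Score each entity by its Euclidean distance from the centroid.

    Assumption: a higher distance means "more likely unlinkable".
    This is a simple stand-in for a real outlier detector.
    """
    center = centroid(list(entities.values()))
    return {name: math.dist(vec, center) for name, vec in entities.items()}


def scope(entities, keep_ratio=0.75):
    """Rank entities by outlier score and drop the most outlying ones.

    Returns (kept, removed), both ordered from least to most outlying.
    """
    scores = outlier_scores(entities)
    ranked = sorted(scores, key=scores.get)
    n_keep = max(1, math.ceil(len(ranked) * keep_ratio))
    return ranked[:n_keep], ranked[n_keep:]


# Hypothetical toy vectors; real inputs would be learned entity representations.
entities = {
    "orders.customer_id": [1.0, 1.0],
    "customers.id": [1.1, 0.9],
    "customers.name": [0.9, 1.2],
    "hr.job_history": [5.0, 5.0],
}

kept, removed = scope(entities, keep_ratio=0.75)
print("kept:", kept)
print("removed:", removed)
```

With these toy vectors, the entity that sits far from the rest is the one removed, so fewer candidate pairs remain for the downstream matching step.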
The annotated multi-sourced entity linkage dataset is built from sample schemas provided by the following three database vendors:
- MySQL: https://www.mysqltutorial.org/mysql-sample-database.aspx
- Category domain-specific: 3 x Orders-Customers schemas (Oracle, MySQL, SAP HANA)
- Category domain-agnostic: 1 additional Human-Resources schema (Oracle)
Contact leonard.traeger@umbc.edu with any related questions.
If you make use of the Collective Scoping method in your research, please cite the following in your manuscript:
@article{DBLP:journals/sncs/TraegerBK25,
author = {Leonard Traeger and Andreas Behrend and George Karabatis},
title = {Collective Scoping: Streamlining Entity Sets Towards Efficient and Effective Entity Linkages},
journal = {{SN} Comput. Sci.},
volume = {6},
number = {3},
pages = {238},
year = {2025},
url = {https://doi.org/10.1007/s42979-025-03734-7},
doi = {10.1007/S42979-025-03734-7}
}
