Title |
An Optimized Method For Arabic Cross Document Named Entity Normalization |
Authors |
Khaled S. Refaat and Amgad Madkour |
Abstract |
This paper presents a technique for Arabic cross-document named entity normalization. The proposed method offers a significant time improvement over the conventional n×n comparisons performed between named entities. It relies on a novel, efficient algorithm that avoids normalizing each new entity against all existing entities. Only a single candidate from the normalized entities is chosen to be checked against each new entity. This allows extensive normalization checking to be applied only to the entity that is most likely to be normalized. Our results show that we obtain comparable results in nearly half the time required by conventional named entity normalization methods. We have also tuned an SVM model that decides whether two entities should be merged or not. This SVM model outperforms the related work in accuracy by 9%. |
Topics |
Methods, tools and procedures for acquisition, creation, management, access, distribution and use of Arabic LRs |
Full paper |
An Optimized Method For Arabic Cross Document Named Entity Normalization |
Bibtex |
@InProceedings{REFAAT09.49,
author = {Khaled S. Refaat and Amgad Madkour},
title = {An Optimized Method For Arabic Cross Document Named Entity Normalization},
booktitle = {Proceedings of the Second International Conference on Arabic Language Resources and Tools},
year = {2009},
month = {April},
date = {22-23},
address = {Cairo, Egypt},
editor = {Khalid Choukri and Bente Maegaard},
publisher = {The MEDAR Consortium},
isbn = {2-9517408-5-9},
language = {english}
} |