Title |
Sarrif! - The Elegant Arabic Morphology Parser |
Authors |
Suhel Jaber and Rodolfo Delmonte |
Abstract |
In this paper we present Sarrif, our Arabic morphology parser, which takes a novel approach to describing Arabic morphology with 2-tape finite-state transducers. The approach relies on a systematic use of composition that incrementally substitutes concatenated lexical morpheme specifications with their surface realizations for non-concatenative processes (in Arabic, templatic interdigitation and non-templatic circumfixation). We argue that (1) incremental substitution through composition allows for an elegant description of all the main morphological processes found in natural languages, including non-concatenative ones, in strictly finite-state terms, without resorting to extensions of any sort; (2) our approach allows for the most logical encoding of every kind of dependency, including traditional long-distance ones (mutual exclusiveness), circumfixations, and idiosyncratic root-and-pattern combinations; (3) a careful use of composition such as ours yields a single system that can easily be adapted to serve both as a stemmer (or lexicon development tool) and as a full-fledged lexical transducer. |
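The idea of incremental substitution through composition can be illustrated with a toy sketch. It is not the Sarrif implementation: plain Python functions stand in for transducers, only the generation direction is modelled, and the token notation (`ROOT:`, `PAT:`, `CIRC:`) and the function names are hypothetical conventions chosen for this example. Each stage rewrites part of a concatenated lexical specification into its surface realization, and the stages are chained in the way composed transducers would be.

```python
def interdigitate(tokens):
    """Stage 1: replace an adjacent (root, pattern) pair with the stem.

    Each 'C' slot in the pattern is filled with the next root consonant.
    """
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.startswith("ROOT:") and nxt is not None and nxt.startswith("PAT:"):
            root = iter(tok[len("ROOT:"):])
            pattern = nxt[len("PAT:"):]
            out.append("".join(next(root) if ch == "C" else ch for ch in pattern))
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


def circumfix(tokens):
    """Stage 2: replace a circumfix marker plus the following stem
    with prefix + stem + suffix."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("CIRC:") and i + 1 < len(tokens):
            prefix, suffix = tok[len("CIRC:"):].split("_")
            out.append(prefix + tokens[i + 1] + suffix)
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


def compose(*stages):
    """Chain the stages left to right; a stand-in for transducer composition."""
    def run(tokens):
        for stage in stages:
            tokens = stage(tokens)
        return tokens
    return run


generate = compose(interdigitate, circumfix)

# Lexical side: circumfix marker, root k-t-b, pattern CCuC (all concatenated).
lexical = ["CIRC:ya_uuna", "ROOT:ktb", "PAT:CCuC"]
surface = "".join(generate(lexical))
print(surface)
```

In this sketch the lexical side stays purely concatenative, and each composed stage handles one non-concatenative process. A real finite-state toolkit would express the same stages as relations, so that the composed machine could also run in the analysis direction.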
Topics |
Taggers and Parsers |
Full paper |
Sarrif! - The Elegant Arabic Morphology Parser |
Bibtex |
@InProceedings{JABER09.42,
author = {Suhel Jaber and Rodolfo Delmonte},
title = {Sarrif! - The Elegant Arabic Morphology Parser},
booktitle = {Proceedings of the Second International Conference on Arabic Language Resources and Tools},
year = {2009},
month = {April},
date = {22-23},
address = {Cairo, Egypt},
editor = {Khalid Choukri and Bente Maegaard},
publisher = {The MEDAR Consortium},
isbn = {2-9517408-5-9},
language = {english}
} |