Arabic text is normally written unvocalized, except in classical texts and the Quran, and this is a major stumbling block for any NLP system. The Kalmasoft diacritizing module was developed to perform full or partial vocalization of raw input text.
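To make the problem concrete, the following minimal Python sketch shows the inverse operation: removing vocalization marks. It is not the Kalmasoft module; the names `strip_diacritics`, `has_diacritics`, `KASARA`, and `KASR` are hypothetical, and the sketch assumes the marks of interest are the Unicode code points U+064B to U+0652 plus U+0670. It illustrates how two distinct vocalized words collapse into the same unvocalized string, which is the ambiguity a diacritizer has to resolve.

```python
import re

# Arabic vocalization marks: fathatan..sukun (U+064B-U+0652)
# plus the superscript (dagger) alef (U+0670). Assumed set, not exhaustive.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return the text with all vocalization marks removed."""
    return DIACRITICS.sub("", text)


def has_diacritics(text: str) -> bool:
    """True if at least one vocalization mark is present."""
    return DIACRITICS.search(text) is not None


# Two different words, written with escapes so the marks are unambiguous.
KASARA = "\u0643\u064E\u0633\u064E\u0631\u064E"  # kasara, "he broke"
KASR = "\u0643\u064E\u0633\u0652\u0631"          # kasr, "a breaking"

if __name__ == "__main__":
    # Both vocalized forms reduce to the same three bare letters.
    print(strip_diacritics(KASARA) == strip_diacritics(KASR))
```

Going the other way, from the bare letters back to the intended vocalized form, requires lexical and contextual knowledge that a character-level rule like this one cannot supply.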
Arabic noun declension is the process of inflecting nouns into their grammatical subcategories. MAPS inflects every Arabic noun into more than a dozen categories, including classes such as Verbal Noun, Noun of Instrument, Active Participle, Passive Participle, Noun of Place, and Noun of Time, and the three cases: Nominative, Accusative, and Genitive.
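As a rough illustration of template-based noun derivation and case marking, the sketch below fills a few well-known Form I noun patterns with the consonants of a root and appends the indefinite case endings. It works on transliterated text, is not how MAPS is implemented, and the names `NOUN_PATTERNS`, `CASE_ENDINGS`, and `decline` are hypothetical. It also assumes a sound triconsonantal root; in practice the verbal noun of a Form I verb is lexically determined, so the single pattern used here is a simplification.

```python
# Templates over the three root slots {0}{1}{2}; illustrative, not exhaustive.
NOUN_PATTERNS = {
    "verbal_noun": "{0}a{1}{2}",            # e.g. fatḥ
    "active_participle": "{0}ā{1}i{2}",     # e.g. fātiḥ
    "passive_participle": "ma{0}{1}ū{2}",   # e.g. maftūḥ
    "noun_of_instrument": "mi{0}{1}ā{2}",   # e.g. miftāḥ
}

# Indefinite singular case endings (nunation).
CASE_ENDINGS = {"nominative": "un", "accusative": "an", "genitive": "in"}


def decline(root):
    """Map each noun class to its three case forms for a 3-consonant root."""
    c1, c2, c3 = root
    table = {}
    for noun_class, pattern in NOUN_PATTERNS.items():
        stem = pattern.format(c1, c2, c3)
        table[noun_class] = {
            case: stem + ending for case, ending in CASE_ENDINGS.items()
        }
    return table


if __name__ == "__main__":
    for noun_class, forms in decline(("f", "t", "ḥ")).items():
        print(noun_class, forms)
```

Keeping the patterns and endings in plain dictionaries means further classes, such as Noun of Place or Noun of Time, could be added as data rather than as new code paths.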
Our full-fledged morphological analyzer uses a light stemmer that performs not only affix removal but also root extraction. It applies elaborate techniques to handle all forms of assimilated, hollow, and defective tokens: the analyzer performs the pattern recognition needed to complete the task and returns the correct form of the root or stem.
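The two stages described above, affix removal followed by pattern-based root extraction, can be sketched generically as follows. This is not the Kalmasoft analyzer: the affix lists are a small illustrative set of the kind commonly used in light stemmers, the pattern list is deliberately tiny, and the names `strip_affixes` and `extract_root` are hypothetical. The sketch assumes sound roots only; assimilated, hollow, and defective forms would need additional rules that are not shown.

```python
# Illustrative affix lists (assumed), tried longest first.
PREFIXES = sorted(["ال", "وال", "بال", "كال", "فال", "لل", "و"],
                  key=len, reverse=True)
SUFFIXES = sorted(["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"],
                  key=len, reverse=True)

# Morphological patterns; the letters ف ع ل mark the three root slots.
PATTERNS = ["مفعول", "فاعل", "فعل"]
SLOTS = "فعل"


def strip_affixes(word, min_len=3):
    """Remove at most one prefix and one suffix, keeping min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def extract_root(stem):
    """Return the root letters if the stem fits a known pattern, else None."""
    for pattern in PATTERNS:
        if len(pattern) != len(stem):
            continue
        root = []
        for slot, letter in zip(pattern, stem):
            if slot in SLOTS:
                root.append(letter)      # root consonant position
            elif slot != letter:
                break                    # fixed pattern letter does not match
        else:
            return "".join(root)
    return None


if __name__ == "__main__":
    for token in ["والكاتبون", "المكسور"]:
        stem = strip_affixes(token)
        print(token, stem, extract_root(stem))
```

Separating the stem from the root in this way lets a caller choose which of the two to index, and the `min_len` guard is one simple way to avoid stripping letters that belong to a short word.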
The Conjugation Generator is a full-form lexical production module built on a root-based algorithm. A root such as [ksr] "to break" may be seeded into the module to yield roughly 30,000 conjugations; in theory, the same holds for any other sound triconsonantal root.
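A very small slice of root-based generation can be sketched as below: the thirteen person, number, and gender forms of the active perfect for a sound triconsonantal root, in transliteration. This is not the Conjugation Generator itself; `PERFECT_SUFFIXES` and `conjugate_perfect` are hypothetical names, and the stem vowel is passed in as a parameter because it varies from verb to verb (it is "a" for kasara). The full figure quoted above would additionally involve other tenses, moods, voices, and derived forms that this sketch does not attempt.

```python
# Suffixes of the active perfect, keyed by person/gender/number.
PERFECT_SUFFIXES = {
    "3ms": "a",    "3fs": "at",   "2ms": "ta",  "2fs": "ti",    "1s": "tu",
    "3md": "ā",    "3fd": "atā",  "2d": "tumā",
    "3mp": "ū",    "3fp": "na",   "2mp": "tum", "2fp": "tunna", "1p": "nā",
}


def conjugate_perfect(c1, c2, c3, stem_vowel="a"):
    """Return the 13 active perfect forms for a sound 3-consonant root."""
    stem = f"{c1}a{c2}{stem_vowel}{c3}"   # e.g. k-s-r -> kasar
    return {slot: stem + suffix for slot, suffix in PERFECT_SUFFIXES.items()}


if __name__ == "__main__":
    for slot, form in conjugate_perfect("k", "s", "r").items():
        print(slot, form)
```

Because the root consonants are the only input, the same table applies unchanged to any other sound root, for example `conjugate_perfect("k", "t", "b")`.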