
Arabic Semantic Processing System (MAPSSeman®)

Synopsis

MAPSSeman® is the family name of our Arabic semantic processing package: a set of specialized modules tuned for applications such as information retrieval, document clustering, rule-based machine translation (RBMT), and example-based machine translation (EBMT), among others.
MAPSSeman Family Members  |  Related Databases  |  Other software of interest
Arabic Text Parser  |  Arabic Corpus  |  Arabic Text Stemmer
Arabic Part of Speech Tagger  |  Database of Arabic Roots  |  Arabic Verb Conjugator
Arabic Ontology Processor  |  Database of Arabic Stems  |  Arabic Text Diacritizer
  |  Database of Loan Words  |  Arabic Noun Inflector
  |  Database of Loan Terms  |  Personal Names Retrieval System
  |  Database of Colloquial Arabic  |  Geographical Names Romanizer
  |  Database of English/Arabic Entity Names  |

 Arabic Text Parser
Parsing is key to accurate translation: once text is correctly disassembled, it is much easier to transfer to a different (or simplified) language. Kalmasoft is developing a high-end Arabic parser that analyzes natural text, represents it as abstract elements and relationships, and then uses that representation as the seed for generating text in a new language. The parser uses a bottom-up parsing technique and is language dependent for now; however, migration to other languages is possible and requires only a different rule set and a number of lexicons for each additional language. Please refer to Arabic Text Parser for details.
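To illustrate what bottom-up parsing with a separate rule set and lexicon looks like, here is a minimal sketch using the CYK chart algorithm, one common bottom-up technique. It is not Kalmasoft's parser: the grammar, the lexicon, the English placeholder words, and the names BINARY_RULES, LEXICON, and cyk_parse are all assumptions made for the example.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not Kalmasoft's rule set).
# Each binary rule maps a pair of child categories to a parent category.
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("DET", "N"): "NP",
    ("V", "NP"): "VP",
}

# Hypothetical lexicon; English placeholder words keep the sketch readable.
LEXICON = {
    "the": "DET",
    "student": "N",
    "book": "N",
    "read": "V",
}


def cyk_parse(words):
    """Bottom-up (CYK) parse; returns a nested-tuple tree rooted at S, or None."""
    n = len(words)
    # chart[i][j] maps a category to one tree covering words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

    # Start at the bottom: assign a category to every single word.
    for i, word in enumerate(words):
        category = LEXICON.get(word)
        if category is None:
            return None  # unknown word, no parse in this toy setup
        chart[i][i + 1][category] = (category, word)

    # Combine adjacent smaller spans into larger ones.
    for span in range(2, n + 1):
        for start in range(0, n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                left_cell = chart[start][mid].items()
                right_cell = chart[mid][end].items()
                for (lcat, ltree), (rcat, rtree) in product(left_cell, right_cell):
                    parent = BINARY_RULES.get((lcat, rcat))
                    if parent and parent not in chart[start][end]:
                        chart[start][end][parent] = (parent, ltree, rtree)

    return chart[0][n].get("S")


tree = cyk_parse("the student read the book".split())
print(tree)
```

Because the algorithm only consults BINARY_RULES and LEXICON, swapping in a different rule set and lexicon changes the language covered without touching the parsing code, which mirrors the migration path described above.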

MAPS orthographic processor
A screenshot of the program showing the output interface. You can view the technical specifications.

 Arabic PoS Tagger
Part-of-speech tagging is the process of selecting the most likely sequence of syntactic categories for the words in a sentence. It determines grammatical characteristics of the words, such as part of speech, grammatical number, gender, and person. In Arabic this task is not trivial, since most words are ambiguous when written without vowels. For example, the unvowelled form كتب can be read as the verb kataba ("he wrote") or as the noun kutub ("books").
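The idea of "the most likely sequence of syntactic categories" can be made concrete with a small Viterbi decoder over a hidden Markov model. This is a generic statistical sketch, not the method used by Kalmasoft's tagger, which is rule based. The two-tag set, every probability value, and the names TAGS, START, TRANS, EMIT, and viterbi are invented for illustration.

```python
import math

TAGS = ["NOUN", "VERB"]

# All probabilities below are invented for illustration only.
START = {"NOUN": 0.4, "VERB": 0.6}
TRANS = {
    "NOUN": {"NOUN": 0.6, "VERB": 0.4},
    "VERB": {"NOUN": 0.9, "VERB": 0.1},
}
EMIT = {
    # The unvowelled form كتب is ambiguous: noun ("books") or verb ("wrote").
    "NOUN": {"كتب": 0.2, "الطالب": 0.4, "الدرس": 0.4},
    "VERB": {"كتب": 1.0},
}


def _log(p):
    """Log of a probability, with log(0) mapped to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(words):
    """Return the most probable tag sequence for words under the toy model."""
    if not words:
        return []

    # Each column maps tag -> (best log-probability ending in tag, previous tag).
    columns = [{
        tag: (_log(START[tag]) + _log(EMIT[tag].get(words[0], 0.0)), None)
        for tag in TAGS
    }]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            emit = _log(EMIT[tag].get(word, 0.0))
            column[tag] = max(
                (columns[-1][prev][0] + _log(TRANS[prev][tag]) + emit, prev)
                for prev in TAGS
            )
        columns.append(column)

    # Backtrack from the best final tag to recover the whole sequence.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


print(viterbi(["كتب", "الطالب", "الدرس"]))
```

In this toy model the ambiguous first word is resolved by its context: a verb followed by two nouns scores higher than any sequence that reads كتب as a noun.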

For each word, we want at a minimum to identify its main lexical category (noun, verb etc.) and inflectional features if any (plural, past tense etc.). We might also identify some quasi-semantic features (proper noun) or even specify a word sense relative to some lexicon.

Kalmasoft's PoS tagger returns syntax-free solutions for each token through an extensive set of rules. The output is in XML format, but a CSV listing is also possible.
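As a rough picture of what XML and CSV listings of tagged tokens might look like, the sketch below serializes the same token list both ways using Python's standard library. The element names, attribute names, column names, and tag values are hypothetical; the actual output schema of the tagger is not documented here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tagged tokens; field names are illustrative, not Kalmasoft's schema.
TOKENS = [
    {"form": "كتب", "pos": "VERB", "features": "past"},
    {"form": "الطالب", "pos": "NOUN", "features": "singular;masculine"},
]


def to_xml(tokens):
    """Serialize tokens as <sentence><token pos=... features=...>form</token>...</sentence>."""
    root = ET.Element("sentence")
    for tok in tokens:
        element = ET.SubElement(root, "token", pos=tok["pos"], features=tok["features"])
        element.text = tok["form"]
    return ET.tostring(root, encoding="unicode")


def to_csv(tokens):
    """Serialize tokens as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["form", "pos", "features"])
    writer.writeheader()
    writer.writerows(tokens)
    return buffer.getvalue()


print(to_xml(TOKENS))
print(to_csv(TOKENS))
```

XML suits nested or per-token attribute data, while the CSV form is easier to load into spreadsheets and statistics tools.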

A tagged corpus is more useful than an untagged corpus because it carries more information than the raw text alone. Once a corpus is tagged, that information can be extracted and used to create dictionaries and grammars of a language from real language data. Tagged corpora are also useful for detailed quantitative analysis of text.
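A small sketch of the kind of extraction and quantitative analysis a tagged corpus allows: counting tag frequencies and collecting, for each word form, the tags it was seen with (a first step towards a dictionary). The sample (word, tag) pairs, the tag names, and the function names are assumptions made for the example, not data from Kalmasoft's corpus.

```python
from collections import Counter, defaultdict

# Hypothetical tagged corpus as (word, tag) pairs; tags are illustrative.
TAGGED = [
    ("كتب", "VERB"), ("الطالب", "NOUN"), ("الدرس", "NOUN"),
    ("كتب", "NOUN"), ("جديدة", "ADJ"),
]


def tag_frequencies(tagged):
    """Count how often each tag occurs in the corpus."""
    return Counter(tag for _, tag in tagged)


def build_lexicon(tagged):
    """Map each word form to a Counter of the tags it was observed with."""
    lexicon = defaultdict(Counter)
    for word, tag in tagged:
        lexicon[word][tag] += 1
    return lexicon


def ambiguous_words(lexicon):
    """Return the word forms that occur with more than one tag, sorted."""
    return sorted(word for word, tags in lexicon.items() if len(tags) > 1)


lexicon = build_lexicon(TAGGED)
print(tag_frequencies(TAGGED))
print(ambiguous_words(lexicon))
```

None of these counts can be obtained from the raw text alone, since the raw text does not say which reading of an ambiguous form was intended.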

Please refer to Tag-set for the list of tags used in the Arabic corpus. You may also get a sample of our tagged corpus from the Arabic corpus page. Please refer to Arabic Part of Speech Tagger for details.

MAPS semantic processor
A screenshot of the program showing the output interface. You can view the technical specifications.

 Arabic Ontology Processor
This module is currently being developed. Please refer to Arabic Ontology Processor for details.


Facts

Category Software | Reference MAPSSEMANL | Family MAPSSEMAN | Last updated 30/6/2011