MAPSSeman© Lite (PoS Tagger) Specifications

Tagger in words

MAPSSeman© PoS Tagger is an Arabic part-of-speech tagger. It provides a powerful tool for tokenizing Arabic corpora at multiple syntactic levels, exceeding the abilities of many other functional taggers.
Four input methods, suitable for texts of any size.
Uses a compact tokenizer, lemmatizer, and morphological analyzer.
Supports more than 12 tagsets with a robust built-in tokenizer.
True "multi-pass tagging": this professional tool can tag Arabic five levels beyond any other commercial (or academic) tagger currently available.
Advanced and versatile tagset editor with 12 built-in known tagsets ready for use.
Highly customizable UI with many options.
Export to SQL, XML, and worksheet formats.
Built-in XML viewer for browsing the content of already-tagged corpora.
Helps maintain error-free tagged output across the entire scope of your corpora.
Maintains the proper encoding for your multilingual corpus.
Enables the complete automation of the tokenization process.
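
The export feature listed above can be pictured with a minimal Python sketch. This is not MAPSSeman's actual export format: the element names (`corpus`, `token`), the table name `tokens`, the column layout, and the tag labels are all assumptions made for illustration, and only the Python standard library is used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical tagged output as (token, tag) pairs. The tag labels are
# illustrative placeholders, not one of the product's built-in tagsets.
tagged = [("كتب", "VERB"), ("الولد", "NOUN"), ("الدرس", "NOUN")]


def to_xml(pairs):
    """Serialize (token, tag) pairs as UTF-8 encoded XML bytes."""
    root = ET.Element("corpus")  # assumed element name
    for index, (token, tag) in enumerate(pairs, start=1):
        node = ET.SubElement(root, "token", id=str(index), tag=tag)
        node.text = token
    # Encoding to UTF-8 keeps the Arabic characters intact in the output.
    return ET.tostring(root, encoding="utf-8")


def to_sql(pairs, connection):
    """Store (token, tag) pairs in an assumed `tokens` table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(position INTEGER PRIMARY KEY, token TEXT NOT NULL, tag TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT INTO tokens (position, token, tag) VALUES (?, ?, ?)",
        [(i, token, tag) for i, (token, tag) in enumerate(pairs, start=1)],
    )
    connection.commit()


if __name__ == "__main__":
    print(to_xml(tagged).decode("utf-8"))
    with sqlite3.connect(":memory:") as conn:
        to_sql(tagged, conn)
        print(conn.execute("SELECT * FROM tokens").fetchall())
```

Keeping the token position as the primary key preserves the original word order, which matters if the tagged corpus is later reassembled from the database.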

Overall Description
  • A context-sensitive Arabic tagger suited to large Arabic text corpora.
  • A professional productivity tool for tagging Arabic text. It is a hybrid, meaning it performs semantic and syntactic analysis, whereas ordinary taggers work at the morphological level and rely heavily on linguistic principles and lexical characteristics. The system incorporates a solid lexical analyzer (stemmer) that prepares the text for the morphological analyzer, which in turn works on morphotactic and contextual probabilities before tagging any token.
  • The system approaches part-of-speech tagging using techniques that go beyond superficial "linguistic mechanics" and string manipulation (stemming, tokenization, morphological analysis, and other classical techniques), leveraging idiosyncratic meaning in a completely new technology.
  • Supports tagsets as specified by CG, Brill, Penn Treebank, CLAWS, Brown, LOB, Khoja.
  • Exports output to XML, SQL, and TXT for processing, and to HTML, XLS, and PDF for viewing.
  • This tool is designed for use with large Arabic corpora; however, many simplified features have been added to assist novice users, making it well suited to academic use as well.
  • The system is equipped with a powerful tagset editor so users can edit the built-in tagsets or start compiling a completely new tagset of their own.
  • Arabic text transliteration (KATS) for readability by non-Arabic speakers.
  • Accepts three Arabic varieties as input (classical, MSA, colloquial).
  • Verbose display of tagged text includes some 10 categories (root, clitics, tense, case, mood, voice, form, gloss, etc.).
  • Sliding window adjustment.
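
To make the ideas of contextual probabilities and an adjustable sliding window more concrete, here is a toy Python sketch. It is not the MAPSSeman algorithm, which this page does not describe. It is a simple greedy, left-to-right disambiguator, and the lexicon, the affinity table, the tag labels, and every probability value are invented for illustration.

```python
# Hypothetical lexicon: candidate tags per surface form with an assumed
# lexical probability. It stands in for a real morphological analyzer.
LEXICON = {
    "ذهب": {"VERB": 0.6, "NOUN": 0.4},  # "went" or "gold"
    "الولد": {"NOUN": 1.0},
    "إلى": {"PREP": 1.0},
    "المدرسة": {"NOUN": 1.0},
    "خاتم": {"NOUN": 1.0},
    "من": {"PREP": 1.0},
}

# Hypothetical affinity of a candidate tag given one earlier tag in the window.
AFFINITY = {
    ("<s>", "VERB"): 0.6, ("<s>", "NOUN"): 0.4,
    ("PREP", "NOUN"): 0.9, ("PREP", "VERB"): 0.05,
    ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.2,
}
DEFAULT_AFFINITY = 0.1  # assumed fallback for tag pairs not listed above


def tag(tokens, window=1):
    """Greedily tag tokens, scoring each candidate against the last `window` tags."""
    if window < 1:
        raise ValueError("window must be at least 1")
    chosen = []
    for token in tokens:
        candidates = LEXICON.get(token, {"UNK": 1.0})
        # The sliding window: the most recent `window` decisions, with a
        # sentence-start marker when fewer decisions are available.
        context = (["<s>"] + chosen)[-window:]

        def score(item):
            candidate, lexical = item
            total = lexical
            for previous in context:
                total *= AFFINITY.get((previous, candidate), DEFAULT_AFFINITY)
            return total

        best_tag, _ = max(candidates.items(), key=score)
        chosen.append(best_tag)
    return list(zip(tokens, chosen))


if __name__ == "__main__":
    print(tag(["ذهب", "الولد", "إلى", "المدرسة"]))
    print(tag(["خاتم", "من", "ذهب"], window=2))
```

With these invented numbers, the ambiguous form ذهب is resolved as a verb at the start of a sentence and as a noun after a preposition. A production tagger would more likely search over whole tag sequences than commit greedily to each token.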
Details
Screenshots (images not reproduced here):
  • Corpus options: versatile corpus tuning options
  • Mini report generation
  • Single catena entry: dual entry option
  • Arabic varieties: input of different Arabic varieties

Category Software | Reference MSLTAG | Family MAPSSEMANL | Last updated 8/3/2019