MAPSSeman® Lite (PoS Tagger) Specifications

Tagger in words

MAPSSeman® PoS Tagger is a deep tagger, meaning it tokenizes an Arabic corpus at multiple syntactic levels, beyond the capabilities of many other functional taggers.
  • Four input methods, suitable for texts of any size.
  • Uses a compact tokenizer, lemmatizer, and morphological analyzer.
  • Supports more than 12 tagsets, with a robust built-in tokenizer.
  • True "deep tagging": this professional tool can tag Arabic 5 levels deeper than any other commercial or academic tagger currently available.
  • Advanced and versatile tagset editor, with 12 well-known built-in tagsets ready for use.
  • Highly customizable UI with many options.
  • Export to XML and Excel worksheets.
  • Built-in XML viewer for browsing the content of an already tagged corpus.
  • Helps you maintain error-free tagged output across your entire corpus.
  • Maintains the proper encoding for a multilingual corpus.
  • Enables complete automation of the tokenization process.

Overall Description
  • A context-sensitive Arabic tagger suited to large Arabic corpora.
  • A professional productivity tool for tagging Arabic text. It is a hybrid tagger, meaning it performs semantic and syntactic analysis (ordinary taggers work at the morphological level, relying heavily on linguistic principles and lexical characteristics). The system incorporates a solid lexical analyzer (stemmer) that prepares the text for the morphological analyzer, which in turn works on morphotactic and contextual probabilities before tagging any token.
  • The system approaches part-of-speech tagging with techniques that go beyond the superficial "linguistic mechanics" and string manipulation of classical methods (stemming, tokenization, morphological analysis, and the like), leveraging idiosyncratic meaning in a completely new technology.
  • Supports tagsets as specified by CG, Brill, Penn Treebank, CLAWS, Brown, LOB, Khoja.
  • Exports output to SQL and TXT for processing, and to HTM, DOC, PDF, and ODT for viewing.
  • This tool is designed for large Arabic corpora; however, many simplified features have been added to assist novice users, making it well suited to academic use as well.
  • The system is equipped with a powerful tagset editor so users can edit the built-in tagsets or start compiling a completely new tagset of their own.
  • Arabic text transcription (KATS/IPA) for readability by non-speakers.
  • Input in three Arabic varieties (Classical, MSA, colloquial).
  • Verbose display of tagged text includes some 10 categories (root, clitics, tense, case, mood, voice, form, gloss, etc.).
  • Sliding window adjustment.
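
To illustrate what an adjustable sliding window means for a context-sensitive tagger, here is a minimal sketch in Python. It is not the MAPSSeman® implementation or API; the function name `context_windows`, the `size` parameter, and the `<PAD>` marker are assumptions made purely for illustration. It shows only how each token can be paired with its neighbours so that a later step could weigh contextual evidence before assigning a tag.

```python
from typing import List, Tuple

PAD = "<PAD>"  # hypothetical marker for positions outside the sentence


def context_windows(tokens: List[str], size: int = 2) -> List[Tuple[str, ...]]:
    """Pair each token with the `size` tokens on either side of it."""
    if size < 0:
        raise ValueError("size must be non-negative")
    padded = [PAD] * size + list(tokens) + [PAD] * size
    # The window for token i is padded[i : i + 2*size + 1], centred on that token.
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(tokens))]


if __name__ == "__main__":
    # "The boy went to the school". The first word can be read as a verb
    # ("went") or a noun ("gold"), so its neighbours matter for tagging.
    sentence = ["ذهب", "الولد", "إلى", "المدرسة"]
    for window in context_windows(sentence, size=1):
        print(window)
```

A larger `size` gives the tagger more context per token at the cost of more processing; a smaller one is faster but sees less of the sentence. How the product actually uses its window setting is not documented here.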
Details
  • Corpus options: versatile corpus tuning options.
  • Mini report generation.
  • Single catena entry: dual entry option.

Facts

Category Software | Reference MSLTAG | Family MAPSSEMANL | Last updated 31/3/2016