Orthography - Annotated Arabic Corpus

Synopsis

Annotated Arabic corpora.
MAPSOrtho Family Members: Arabic Text Diacritizer, Arabic Noun Inflector, Arabic Text Stemmer, Arabic Verb Conjugator, Arabic Lemmatizer
Related Databases: Arabic Corpus, Database of Arabic Roots, Database of Arabic Stems, Database of Loan Words, Database of Loan Terms, Database of Colloquial Arabic, Database of English/Arabic Entity Names
Other software of interest: Arabic Part of Speech Tagger, Arabic Text Parser, Geographic Names Romanizer

A highly precise set of grammatically tagged Arabic corpora. These POS-tagged lexical resources are invaluable for users who conduct research on the corpus; linguists with other interests and differing perspectives may also exploit this resource in many other language disciplines.

 

Raw input text
=====

Unvocalized KATS version
=====

Annotated corpus text
=====

Annotated KATS version
=====

Please download more organized samples in TXT format from dbsamples.htm.

Arabic Full-form Monolingual Dictionary

This large lexicon is generated automatically by Kalmasoft linguistic engines and is a very useful tool for research in statistical NLP. It generally conforms to MSA (Modern Standard Arabic) and covers the most regular content of the Arabic language, including the most dominant grammatical classes. The resource is valuable for purposes that include corpus annotation, abstraction, and analysis in general. [300 million entries]

Entry: Arabic entry
Vocalized: Vocalized Arabic
KATS: KATS version of field "Vocalized" (Kalmasoft KATS)
IPA: International Phonetic Alphabet
Class: Entry grammatical class
Sense: Entry sense
Type: Grammatical class

Please download more organized samples in TXT format from dbsamples.htm.

ID Entry Vocalized KATS IPA Class Sense Type
0001
0002
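
For readers who want to load the downloadable TXT samples into a program, the record layout above can be sketched as a small Python data class. This is only an illustration under stated assumptions: the column order is taken from the table header, but the tab delimiter, the names LexiconEntry and parse_line, and the placeholder values are hypothetical and are not part of the Kalmasoft distribution. Check the real sample files for the actual delimiter and encoding before relying on it.

```python
from dataclasses import dataclass

# Column order taken from the table header: ID Entry Vocalized KATS IPA Class Sense Type
FIELDS = ["ID", "Entry", "Vocalized", "KATS", "IPA", "Class", "Sense", "Type"]


@dataclass
class LexiconEntry:
    """One row of the full-form dictionary (hypothetical in-memory shape)."""
    id: str
    entry: str               # Arabic entry
    vocalized: str           # vocalized Arabic form
    kats: str                # KATS version of the vocalized form
    ipa: str                 # International Phonetic Alphabet transcription
    grammatical_class: str   # "Class" column
    sense: str               # "Sense" column
    entry_type: str          # "Type" column


def parse_line(line: str, sep: str = "\t") -> LexiconEntry:
    """Split one delimited line into a LexiconEntry.

    The tab separator is an assumption; pass a different `sep` if the
    sample files use another delimiter.
    """
    parts = line.rstrip("\n").split(sep)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return LexiconEntry(*parts)


# Placeholder values only; these are not real entries from the lexicon.
sample = "\t".join(
    ["0001", "<entry>", "<vocalized>", "<kats>", "<ipa>", "<class>", "<sense>", "<type>"]
)
record = parse_line(sample)
print(record.id, record.grammatical_class)
```

Keeping every field as a string avoids guessing at value types that the page does not document; the length check surfaces rows whose delimiter differs from the assumed one.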


Facts

Category Databases | Reference DBCORPUS | Entries 50+ | Last updated 5/01/2015