Arabic Named Entity Extractor

Synopsis

Named Entity Recognition (NER) is the process of identifying the informative lexical items in a sentence that name entities. It uses the syntactic and semantic characteristics of words in unstructured text to label them as person, place, organization, date, and so on, and further classifies them into subcategories according to the taxonomy implemented.

Information


Reference: MNERSYS

Last updated: 15/1/2023

Preview

Kalmasoft NERSys is an Arabic Named Entity Recognition/Extraction tool aimed at preparing annotated Arabic corpora. It is a context-sensitive, rule-based solution that uses a comprehensive set of hand-crafted semantic and syntactic rules to process unstructured Arabic text. The output is an annotated, structured corpus in XML or JSON format; SQL database and TXT output are also available. For quick review, HTML, XLSX, and PDF output are available as well.
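As a rough illustration of what a context-sensitive rule and a JSON-annotated result could look like, the sketch below tags the word that follows a trigger word. It is not the NERSys rule set or output schema, neither of which is documented here; the TRIGGERS table, the annotate function, and the field names "text", "type", and "token_index" are all assumptions made for this example.

```python
import json

# Hypothetical trigger words: a word that, when it precedes a token,
# hints at the entity type of that token. Not the NERSys rules.
TRIGGERS = {
    "السيد": "PERSON",        # "Mr."
    "مدينة": "LOCATION",      # "city of"
    "شركة": "ORGANIZATION",   # "company"
}

def annotate(text):
    """Tag the single token that follows each trigger word."""
    tokens = text.split()
    entities = []
    for i in range(len(tokens) - 1):
        label = TRIGGERS.get(tokens[i])
        if label:
            entities.append({
                "text": tokens[i + 1],
                "type": label,
                "token_index": i + 1,
            })
    # Assumed output layout: the raw text plus a list of entity records.
    return {"text": text, "entities": entities}

# "Mr. Ahmed visited the city of Khartoum"
sample = "زار السيد أحمد مدينة الخرطوم"
result = annotate(sample)

# ensure_ascii=False keeps the Arabic readable in the JSON output.
print(json.dumps(result, ensure_ascii=False, indent=2))
```

A real rule-based system would need far more than this, for example multi-word entities, clitic handling, and rules that look at context on both sides of a candidate; the sketch only shows the general shape of a trigger-driven rule feeding a structured output.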

NERSys is designed to prepare structured Arabic datasets. Documents of unstructured text are difficult to use in their raw form in NLP applications such as MT, IR, entity linking, semantic search, and search engines, whereas an annotated corpus carries more information than the raw text alone.

NERSys also implements an advanced classification algorithm that categorizes text into more than 20 predefined subject domains.


Sample output
[Figure: Arabic Named Entity Recognition]