MAPSOno is the family name of our anthroponymy processing package: a set of specialized modules tuned for applications such as information retrieval, document clustering, named entity recognition (NER), and translation. Output samples are available for download as PDF. You are also invited to send us your own sample input for processing free of charge; we accept samples of limited size in any format or encoding.

Romanization reproduces the sound of words according to the orthographic rules of the target language (mainly languages written in the Latin script). Uniform results are difficult to obtain when romanizing languages written in non-Latin scripts, since vowel points and diacritical marks, if any, are generally omitted from both handwritten and machine-produced text. It follows that correct identification of the words in any particular name requires knowledge of its standard Arabic-script spelling, including proper pointing, and recognition of dialectal and idiosyncratic deviations.

In multilingual applications, cross-lingual name matching is complicated by the fact that when a name is written in a script other than its native one, each phoneme may have several alternative representations, leading to a large number of potential variants. This is true of the romanized forms of names in many languages, Arabic among them. By default, Romanizer performs an orthographic transcription, a process that can distinguish between names from different cultures and applies a sophisticated algorithm to provide the most likely variations. It is neither a simple name-matching algorithm nor a name combination generator. Instead, it goes all the way back to create a new legitimate version of the input name token and uses it to decide which legitimate variants to return, thus avoiding the long lists of trivial and non-existent variants that simple combination processes mostly produce. It also tracks cultural variation through a look-up database of refined etymological data, so that each name is romanized exactly the way the specific culture romanizes it.

Semitic languages such as Arabic are normally written unvocalized, and an unvocalized Arabic name does not carry enough information for accurate romanization. Because Arabic personal names come without diacritical marks, a fully and accurately vocalized version of the name is required first. The system therefore serves as a major pre-processor and provides three optional vocalization modes to choose from, since adding vowels to Arabic personal names is critical to any romanization or transcription process. This module disambiguates Arabic personal names and can be viewed as a front-end processor for the transcription system. It is name-specific: using it to vocalize ordinary text will produce unpredictable output.
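The information loss described above can be illustrated with a minimal Python sketch. It is not the MAPSOno vocalizer; it only runs in the opposite direction, stripping the vowel marks from a vocalized name to show what the pre-processor has to restore. The function names `strip_harakat` and `is_vocalized` are hypothetical, and the code-point range U+064B to U+0652 (tanween, fatha, damma, kasra, shadda, sukun) is an assumption about which marks count as vocalization.

```python
# Arabic vocalization marks: fathatan .. sukun (U+064B .. U+0652).
# Assumption: these eight code points are what "vocalization" adds.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_harakat(name: str) -> str:
    """Return the name as it is usually written, without vowel marks."""
    return "".join(ch for ch in name if ch not in HARAKAT)


def is_vocalized(name: str) -> bool:
    """True if the name carries at least one vocalization mark."""
    return any(ch in HARAKAT for ch in name)


if __name__ == "__main__":
    vocalized = "حُسَام"  # vocalized form of a personal name
    bare = strip_harakat(vocalized)
    print(vocalized, "->", bare)
    print("vocalized input:", is_vocalized(vocalized))
    print("bare output still vocalized:", is_vocalized(bare))
```

A check such as `is_vocalized` could be used to decide whether a name still needs the vocalization step before transcription; how the product actually makes that decision is not documented here.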

Most Arabic personal names in the language's vocabulary have only one spelling in Arabic script. Variants arise only when the name is romanized into languages that have no one-to-one phonetic relationship with Arabic, e.g. the Romance languages. As a result, each Arabic name has many romanized variants in other languages.

A major problem for Arabic text processing systems is the absence of diacritics and vowel marks (Fat'ħa, Damma, Kasra, Shadda, Sukuun) from almost all Arabic writing. The information contained in the unvocalized orthography of Arabic names is not sufficient for most applications that are supposed to transliterate Arabic proper nouns into English automatically. For instance, the names "أحْمَد" /ʔaħ'mad/ and "مُحَمَّد" /muħam'mad/ have the following transcription pairs, respectively:

"Ahmet" and "Muhemet" in Turkey,
"Ahmed" and "Mohamed" in North Africa, e.g. Morocco, Algeria, Tunisia, Mauritania, and Egypt,
"Ahmad" and "Muhamad" in the Gulf area, e.g. KSA, Iraq, Oman, etc.

The pairs shown above do not tell the whole story: the list of variants for such very common names would be very long without enforcing rules that limit the output to acceptable phonemic forms. The system is designed to deal with such complexities and to return all romanized versions of any personal name, with even these minimal nuances distinguished and clearly shown.
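The idea of culture-specific variants can be sketched as a small look-up in Python. This is only an illustration built from the three regional pairs quoted above; the dictionary `REGIONAL_VARIANTS`, the region labels, and both function names are hypothetical and say nothing about how the MAPSOno database is actually organized.

```python
from typing import Dict, List, Optional

# Hypothetical look-up table: unvocalized Arabic name -> region -> romanized form.
# The entries are exactly the pairs listed in the text, nothing more.
REGIONAL_VARIANTS: Dict[str, Dict[str, str]] = {
    "أحمد": {"Turkey": "Ahmet", "North Africa": "Ahmed", "Gulf": "Ahmad"},
    "محمد": {"Turkey": "Muhemet", "North Africa": "Mohamed", "Gulf": "Muhamad"},
}


def romanize_for_region(name: str, region: str) -> Optional[str]:
    """Return the romanized form used in one region, or None if unknown."""
    return REGIONAL_VARIANTS.get(name, {}).get(region)


def all_variants(name: str) -> List[str]:
    """Return every known romanized form of a name, sorted alphabetically."""
    return sorted(set(REGIONAL_VARIANTS.get(name, {}).values()))


if __name__ == "__main__":
    for arabic_name in REGIONAL_VARIANTS:
        print(arabic_name, all_variants(arabic_name))
```

A table like this only enumerates forms that are already known. Restricting generated variants to acceptable phonemic forms, as described above, needs rules on top of it.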

Figure: MAPSOno Romanizer functional diagram.
Figure: Screenshot of the MAPSOno Romanizer interface.
The technical specifications are available for viewing, and an evaluation copy is available for download.

Vocalized: vocalized version of the name
UNGEGN: United Nations Group of Experts on Geographical Names
ALA-LC: American Library Association-Library of Congress
DIN: Deutsches Institut für Normung
SATTS: Standard Arabic Technical Transliteration System
Common: popular Romanization

ID | Entry | Vocalized | UNGEGN | ALA-LC | DIN | SATTS | Common
0001 | أبوبكر | أَبُوبَكْر | Abū Bakr | Abū Bakr | Abu Bakr | AEBuWBaKoR | Abu Bakr
0002 | حزام | حَزَام | Ḩazām | Ḥazām | Ḥazām | Ha;aAM | Hazam
0003 | حسام | حُسَام | Ḩusām | Ḥusām | Ḥusām | HuSaAM | Husam
0004 | حسان | حَسَّان | Ḩassān | Ḥassān | Ḥassān | HaS*aAN | Hassan
0005 | خولي | خُولِي | Khūlī | Khūlī | Ḫūliy | OuWLiI | Khuli
0006 | خنساء | خَنْسَاء | Khansā’ | Khansā' | Ḫansā | OaNoSAE | Khansa
0007 | خضاب | خِضَاب | Khiḑāb | Khiḍāb | Ḫiḍāb | OiVaAB | Khidab
0008 | جداول | جَدَاوِل | Jadāwil | Jadāwil | Ğadāwil | JaDaAWiL | Jadawil
0009 | جيداء | جِيدَاء | Jīdā’ | Jīdā' | Ğīdaā | JiIDaAE | Jidaa
0010 | جدوى | جَدْوَى | Jadwā | Jadwā | Ğadwā | JaDoWa/ | Jadwa
0011 | ثمينة | ثَمِينَة | Thamīnah | Thamīnah | Ṯamīnah | CaMiINa@ | Thaminah
0012 | ثمرات | ثَمَرَات | Thamarāt | Thamarāt | Ṯamarāt | CaMaRaAT | Thamarat
0013 | ثابت | ثَابِت | Thābit | Thābit | Ṯābit | CaABiT | Thabit
0014 | برهان | بُرْهَان | Burhān | Burhān | Burhān | BuRo~aAN | Burhan
0015 | بركات | بَرَكَات | Barakāt | Barakāt | Barakāt | BaRaKaAT | Barakat
0016 | بركة | بَرَكَة | Barakah | Barakah | Barakah | BaRaKa@ | Barakah
0017 | بسمة | بَسْمَة | Basmah | Basmah | Basmah | BaSoMa@ | Basmah
0018 | إسراء | إِسْرَاء | Isrā’ | Isrā' | Isrā | EiSoRaAE | Israa
0019 | إشراف | إِشْرَاف | Ishrāf | Ishrāf | Išrāf | Ei:oRaAF | Ishraf
0020 | دعد | دَعْد | Da`d | Da’d | Daʿd | Da`oD | Daad
0021 | دعجاء | دَعْجَاء | Da`jā’ | Da’jā' | Daʿǧā | Da`oJAE | Daja
0022 | رمضان | رَمَضَان | Ramaḑān | Ramaḍān | Ramaḍān | RaMaVaAN | Ramadan
0023 | رامية | رَامِيَة | Rāmiyah | Rāmiyah | Rāmiyah | RaAMiIa@ | Ramyah
0024 | سخية | سَخِيَّة | Sakhiyyah | Sakhīyah | Saḫīyah | SaOiI*a@ | Sakhyah
0025 | سديد | سَدِيد | Sadīd | Sadīd | Sadīd | SaDiID | Sadeed
0026 | سحر | سَحَر | Saḩar | Saḥar | Saḥar | SaHaR | Sahar
0027 | أم شادي | أُمُّ شَادِي | Umm Shādī | Umm Shādī | Umm Šādy | AEuM*u :aADiI | Um Shady
0028 | شادن | شَادِن | Shādin | Shādin | Šādin | :aADiN | Shadin
0029 | صقر | صَقْر | Şaqr | Ṣaqr | Ṣaqr | XaQoR | Saqr
0030 | ضليع | ضَلِيع | Ḑalī` | Ḍalī’ | Ḍalīʿ | VaLiI` | Daleea
0031 | ضمرة | ضَمْرَة | Ḑamrah | Ḍamrah | Ḍamrah | VaMoRa@ | Damra
0032 | ظاهرة | ظَاهِرَة | Z̧āhirah | Ẓāhirah | Ẓāhirah | YaA~iRa@ | Dhahira
0033 | ظريف | ظَرِيف | Z̧arīf | Ẓarīf | Ẓarīf | YaRiIF | Dhareef
0034 | عائشة | عَائِشَة | `ā’ishah | ’ā'ishah | ʿāišah | `aAEi:a@ | Aishah
0035 | عاطف | عَاطِف | `āţif | ’āṭif | ʿāṭif | `aAUiF | Atif
0036 | عبد المجيب | عَبْدُ الْمَجِيب | `abdul Mujīb | ’abdul Mujīb | ʿabdul Muǧīb | `aBoDu ALoMuJiIB | Abdul Mujeeb
0037 | عبد المجيد | عَبْدُ الْمَجِيد | `abdul Majīd | ’abdul Majīd | ʿabdul Maǧīd | `aBoDu ALoMaJiID | Abdul Majeed
0038 | عريق | عَرِيق | `arīq | ’arīq | ʿarīq | `aRiIQ | Areeq
0039 | عرفات | عَرَفَات | `arafāt | ’arafāt | ʿrafāt | `aRaFaAT | Arafat
0040 | غسان | غَسَّان | Ghassān | Ghassān | Ġassān | GaS*aAN | Ghassan
0041 | غصن | غُصْن | Ghuşn | Ghuṣn | Ġuṣn | GuXoN | Ghusn
0042 | غفران | غُفْرَان | Ghufrān | Ghufrān | Ġufrān | GuFoRaAN | Ghufran
0043 | فيصل | فَيْصَل | Fayşal | Fayṣal | Fayṣal | FaIoXaL | Faisal
0044 | فراس | فِرَاس | Firās | Firās | Firās | FiRaAS | Firas
0045 | كاظمة | كَاظِمَة | Kāz̧imah | Kāẓimah | Kāẓimah | KAYiMa@ | Kazimah
0046 | كاعب | كَاعِب | Kā`ib | Kā’ib | Kāʿib | KA`iB | Kaaib
0047 | كحلاء | كَحْلاَء | Kaḩlā’ | Kaḥlā' | Kāḥlāʾ | KaHoLaAE | Kahlaa
0048 | لبابة | لُبَابَة | Lubābah | Lubābah | Lubābah | LuBaABa@ | Lubabah
0049 | لبيبة | لَبِيبَة | Labībah | Labībah | Labībah | LaBiIBa@ | Labeebah
0050 | ماهرة | مَاهِرَة | Māhirah | Māhirah | Māhirah | MaA~iRa@ | Mahirah
0051 | ماوية | مَاوِيَّة | Māwīyah | Māwīyah | Māwīyah | MaAWiI*a@ | Mawyah
0052 | مائسة | مَائِسَة | Mā’isah | Mā'isah | Māʾisah | MaAEiSa@ | Maisah
0053 | نافع | نَافِع | Nāfi` | Nāfi’ | Nāfiʿ | NaAFi` | Nafia
0054 | هجرس | هَجْرَس | Hajras | Hajras | Haǧras | ~aJoRaS | Hajras
0055 | هيفاء | هَيْفَاء | Hayfā’ | Hayfā' | Hayfaʾ | ~aIoFaAE | Hayfa
0056 | وليد | وَلِيد | Walīd | Walīd | Walīd | WaLiID | Waleed
0057 | واثق | وَاثِق | Wāthiq | Wāthiq | Wāṯiq | WaACiQ | Wathiq
0058 | يمنى | يُمْنَى | Yumna | Yumnā | Yumnā | IuMoNa/ | Yumna
0059 | ياسمينة | يَاسَمِينَة | Yāsamīnah | Yāsamīnah | Yāsamīnah | IaASaMiINa@ | Yasminah
0060 | ياسر | يَاسِر | Yāsir | Yāsir | Yāsir | IaASiR | Yasir
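
The difference between the ALA-LC and Common columns in the sample above can be illustrated with a rough Python sketch. It folds a scholarly romanization to plain ASCII by dropping combining diacritics and the hamza/ayn marks. This is an assumption about how a popular spelling might be approximated, not the rule set the Romanizer uses, and the names `fold_to_ascii` and `GLOTTAL_MARKS` are hypothetical.

```python
import unicodedata

# Characters used in the sample for hamza and ayn.
# Assumption: a plain popular spelling simply omits them.
GLOTTAL_MARKS = {"'", "\u2019", "`", "\u02bf", "\u02be"}


def fold_to_ascii(romanized: str) -> str:
    """Drop combining diacritics and hamza/ayn marks from a romanized name."""
    # NFD splits a letter such as "ā" into "a" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", romanized)
    return "".join(
        ch
        for ch in decomposed
        if not unicodedata.combining(ch) and ch not in GLOTTAL_MARKS
    )


if __name__ == "__main__":
    for ala_lc in ["Ḥusām", "Barakāt", "Ramaḍān", "Sadīd"]:
        print(ala_lc, "->", fold_to_ascii(ala_lc))
```

For entries such as 0003, 0015 and 0022 the folded form coincides with the Common column. For 0025 it yields "Sadid" where the sample lists "Sadeed", which shows that popular spellings follow conventions a purely mechanical fold does not capture.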


