Name Romanizer

Synopsis

MAPSOno is the family name of our anthroponymy processing package: a set of specialized modules tuned for applications such as information retrieval, document clustering, named entity recognition (NER), and translation. Output samples are available for download as PDF. You are also invited to send us your own sample input for processing free of charge; we accept limited-size samples in any format or encoding.



MAPSOno: Personal Names Transcription, Personal Names Romanization System, Personal Names Arabicizing, Personal Names Retrieval, Arabic Names Location
Databases: Arabic Given Names, Arabic Surnames, Arabic Whole Names, Names of Arabic Origins, Non-Arab Names, Unique and Indigenous Names, Famous Names and Celebrities
Software: Romanization, Transcription, Arabicizing, Retrieval

Romanization reproduces the sound of words according to the orthographic rules of a target language, usually one written in the Latin script. Uniform results are difficult to obtain when romanizing languages written in non-Latin scripts, because vowel points and diacritical marks, if any, are generally omitted from both handwritten and machine-produced text. It follows that correctly identifying the words that appear in a particular name requires knowledge of its standard Arabic-script spelling, including proper pointing, and recognition of dialectal and idiosyncratic deviations.

In multilingual applications, cross-lingual name matching is complicated by the fact that when a name is written in a script other than its native one, each phoneme may have several alternative representations, which leads to a large number of potential variants. This is true of the romanized forms of names in many languages, Arabic among them. By default, Romanizer performs an orthographic transcription: it distinguishes between names from different cultures and applies an algorithm that returns the most likely variations. It is neither a simple name-matching algorithm nor a name-combination generator. Instead, it works back from the input name token to a new legitimate version of the name and uses that to decide which legitimate variants to return, thus avoiding the long lists of trivial and non-existent variants that simple combination processes mostly produce. It also tracks cultural variations through a refined etymological look-up database, so that it follows exactly the way a specific culture romanizes the name.
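The culture-specific look-up described above can be pictured with a very small table. The sketch below is only an illustration under stated assumptions: the names `REGIONAL_FORMS`, `romanize_for_region`, and `all_regional_forms` are hypothetical, the region labels are simplified, and the entries are just the regional spellings this page cites for two names. It is not the MAPSOno database or its algorithm.

```python
# Unvocalized Arabic spellings, written as escapes to keep code points explicit.
AHMAD = "\u0623\u062d\u0645\u062f"     # أحمد
MUHAMMAD = "\u0645\u062d\u0645\u062f"  # محمد

# Hypothetical look-up table: name -> region -> romanized spelling.
# The spellings are the ones quoted on this page, nothing more.
REGIONAL_FORMS = {
    AHMAD: {"Turkey": "Ahmet", "North Africa": "Ahmed", "Gulf": "Ahmad"},
    MUHAMMAD: {"Turkey": "Muhemet", "North Africa": "Mohamed", "Gulf": "Muhamad"},
}


def romanize_for_region(name, region):
    """Return the spelling recorded for a region, or None if none is recorded."""
    return REGIONAL_FORMS.get(name, {}).get(region)


def all_regional_forms(name):
    """Return every recorded spelling of a name, de-duplicated and sorted."""
    return sorted(set(REGIONAL_FORMS.get(name, {}).values()))


print(romanize_for_region(MUHAMMAD, "North Africa"))
print(all_regional_forms(AHMAD))
```

Returning `None` for an unknown name or region keeps the sketch honest about what the table does not cover; a real system would need a far larger, curated database.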

Semitic languages such as Arabic are normally written unvocalized, and an unvocalized Arabic name does not carry enough information for accurate romanization. Arabic personal names come without diacritical marks, so a fully and accurately vocalized version of the name is needed before romanization. Because adding vowels to Arabic personal names is critical for any romanization or transcription process, the system includes a vocalization module that acts as a major pre-processor and offers three optional vocalization modes to choose from. The module disambiguates Arabic personal names and can be viewed as a front-end processor for the transcription system. It is name-specific: using it to vocalize ordinary text will produce unpredictable output.
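To see how much information the unvocalized form drops, it helps to run the process in reverse. The sketch below removes the diacritics from a vocalized name using Python's standard `unicodedata` module; the function name `strip_diacritics` is an assumption made for this illustration. It does not attempt vocalization itself, which needs name-specific linguistic knowledge and is not something a few lines of code can reproduce.

```python
import unicodedata


def strip_diacritics(name):
    """Drop combining marks (fatha, damma, kasra, shadda, sukun and similar)."""
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in name if not unicodedata.combining(ch))


# The vocalized form of Muhammad, written as escapes to keep code points explicit.
VOCALIZED = "\u0645\u064f\u062d\u064e\u0645\u0651\u064e\u062f"  # مُحَمَّد
PLAIN = strip_diacritics(VOCALIZED)                              # محمد

# Compare how many code points each form carries.
print(len(VOCALIZED), len(PLAIN))
```

Half of the code points in the vocalized form are marks that ordinary writing leaves out, and those are exactly the vowels and gemination that a romanizer has to supply.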

Most Arabic personal names in the language's vocabulary have only one spelling in Arabic script. Variants arise only when the name is romanized into languages that have no one-to-one phonetic correspondence with Arabic, e.g. all Romance languages. As a result, each Arabic name has many romanized variants in other languages.

A major problem facing Arabic text processing systems is the absence of diacritics and vowel marks (Fat'ħa, Damma, Kasra, Shadda, Sukuun) from almost all Arabic writing. The information contained in the unvocalized orthography of Arabic names is not sufficient for most applications that automatically transliterate Arabic proper nouns into English. For instance, the names "أحْمَد" /ʔaħ'mad/ and "مُحَمَّد" /muħam'mad/ have the following transcription pairs, respectively:

"Ahmet" and "Muhemet" in Turkey
"Ahmed" and "Mohamed" in North Africa, e.g. Morocco, Algeria, Tunisia, Mauritania, and Egypt
"Ahmad" and "Muhamad" in the Gulf region, e.g. KSA, Iraq, Oman, etc.

The pairs shown above do not tell the whole story: the list of variants for such very common names would be very long without some rules to limit the output to acceptable phonemic forms. The system is designed to deal with such complexities and to return all romanized versions of any personal name, with these minimal nuances distinguished and clearly shown.
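The growth in variants is easy to demonstrate. The sketch below is a naive combination generator of the kind this page contrasts with Romanizer, followed by a filter that keeps only attested forms. The per-position alternatives in `SLOTS` are hypothetical and chosen only for illustration, the attested set holds just the three spellings quoted above, and the filter is a stand-in for real limiting rules, not the MAPSOno method.

```python
from itertools import product

# Hypothetical alternatives for each position of one name (illustration only).
SLOTS = [("M",), ("u", "o"), ("h",), ("a", "e"), ("m", "mm"), ("a", "e"), ("d", "t")]

# The three spellings of this name quoted on this page.
ATTESTED = {"Muhemet", "Mohamed", "Muhamad"}


def naive_variants(slots):
    """Return every combination of the per-position alternatives."""
    return ["".join(parts) for parts in product(*slots)]


def attested_variants(slots, attested):
    """Keep only the combinations that appear in a set of attested forms."""
    return sorted(v for v in naive_variants(slots) if v in attested)


print(len(naive_variants(SLOTS)))
print(attested_variants(SLOTS, ATTESTED))
```

Five positions with two alternatives each already give 32 combinations for a seven-position name, most of which nobody uses; every added alternative doubles the list again, which is why unrestricted combination does not scale.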


Figure: MAPSOno Romanizer functional diagram
Figure: screenshot of the MAPSOno Romanizer interface
The technical specifications are available to view, and an evaluation copy is available for download.

Vocalized: vocalized version of the name
UNGEGN: United Nations Group of Experts on Geographical Names
ALA-LC: American Library Association-Library of Congress
DIN: Deutsches Institut für Normung
SATTS: Standard Arabic Technical Transliteration System
Common: popular Romanization

ID Entry Vocalized UNGEGN ALA-LC DIN SATTS Common
0001 أبوبكر أَبُوبَكْر Abū Bakr Abū Bakr Abu Bakr AEBWBKR Abu Bakr
0002 حزام حَزَام Ḩazām Ḥazām Ḥazām H;AM Hazam
0003 حسام حُسَام Ḩusām Ḥusām Ḥusām HSAM Husam
0004 حسان حَسَّان Ḩassān Ḥassān Ḥassān HSAN Hassan
0005 خولي خُولِي Khūlī Khūlī Ḫūliy OWLI Khuli
0006 خنساء خَنْسَاء Khansā’ Khansā' Ḫansā ONSAE Khansa
0007 خضاب خِضَاب Khiḑāb Khiḍāb Ḫiḍāb OVAB Khidab
0008 جداول جَدَاوِل Jadāwil Jadāwil Ğadāwil JDAWL Jadawil
0009 جيداء جِيدَاء Jīdā’ Jīdā' Ğīdaā JIDAE Jidaa
0010 جدوى جَدْوَى Jadwā Jadwā Ğadwā JDWa/ Jadwa
0011 ثمينة ثَمِينَة Thamīnah Thamīnah Ṯamīnah CMIN? Thaminah
0012 ثمرات ثَمَرَات Thamarāt Thamarāt Ṯamarāt CMRAT Thamarat
0013 ثابت ثَابِت Thābit Thābit Ṯābit CABT Thabit
0014 برهان بُرْهَان Burhān Burhān Burhān BR?AN Burhan
0015 بركات بَرَكَات Barakāt Barakāt Barakāt BRKAT Barakat
0016 بركة بَرَكَة Barakah Barakah Barakah BRK? Barakah
0017 بسمة بَسْمَة Basmah Basmah Basmah BSM? Basmah
0018 إسراء إِسْرَاء Isrā’ Isrā' Isrā ESRAE Israa
0019 إشراف إِشْرَاف Ishrāf Ishrāf Išrāf E:RAF Ishraf
0020 دعد دَعْد Da`d Da’d Daʿd D`D Daad
0021 دعجاء دَعْجَاء Da`jā’ Da’jā' Daʿǧā D`JAE Daja
0022 رمضان رَمَضَان Ramaḑān Ramaḍān Ramaḍān RMVAN Ramadan
0023 رامية رَامِيَة Rāmiyah Rāmiyah Rāmiyah RAMI? Ramyah
0024 سخية سَخِيَّة Sakhiyyah Sakhīyah Saḫīyah SOI? Sakhyah
0025 سديد سَدِيد Sadīd Sadīd Sadīd SDID Sadeed
0026 سحر سَحَر Saḩar Saḥar Saḥar SHR Sahar
0027 أم شادي أُمُّ شَادِي Umm Shādī Umm Shādī Umm Šādy AEM :ADI Um Shady
0028 شادن شَادِن Shādin Shādin Šādin :ADN Shadin
0029 صقر صَقْر Şaqr Ṣaqr Ṣaqr XQR Saqr
0030 ضليع ضَلِيع Ḑalī` Ḍalī’ Ḍalīʿ VLI` Daleea
0031 ضمرة ضَمْرَة Ḑamrah Ḍamrah Ḍamrah VMR? Damra
0032 ظاهرة ظَاهِرَة Z̧āhirah Ẓāhirah Ẓāhirah YA?R? Dhahira
0033 ظريف ظَرِيف Z̧arīf Ẓarīf Ẓarīf YRIF Dhareef
0034 عائشة عَائِشَة `ā’ishah ’ā'ishah ʿāišah `AE:? Aishah
0035 عاطف عَاطِف `āţif ’āṭif ʿāṭif `AUF Atif
0036 عبد المجيب عَبْدُ الْمَجِيب `abdul Mujīb ’abdul Mujīb ʿabdul Muǧīb `BD ALMJIB Abdul Mujeeb
0037 عبد المجيد عَبْدُ الْمَجِيد `abdul Majīd ’abdul Majīd ʿabdul Maǧīd `BD ALMJID Abdul Majeed
0038 عريق عَرِيق `arīq ’arīq ʿarīq `RIQ Areeq
0039 عرفات عَرَفَات `arafāt ’arafāt ʿrafāt `RFAT Arafat
0040 غسان غَسَّان Ghassān Ghassān Ġassān GSAN Ghassan
0041 غصن غُصْن Ghuşn Ghuṣn Ġuṣn GXN Ghusn
0042 غفران غُفْرَان Ghufrān Ghufrān Ġufrān GFRAN Ghufran
0043 فيصل فَيْصَل Fayşal Fayṣal Fayṣal FIXL Faisal
0044 فراس فِرَاس Firās Firās Firās FRAS Firas
0045 كاظمة كَاظِمَة Kāz̧imah Kāẓimah Kāẓimah KAYM? Kazimah
0046 كاعب كَاعِب Kā`ib Kā’ib Kāʿib KA`B Kaaib
0047 كحلاء كَحْلاَء Kaḩlā’ Kaḥlā' Kāḥlāʾ KHLAE Kahlaa
0048 لبابة لُبَابَة Lubābah Lubābah Lubābah LBAB? Lubabah
0049 لبيبة لَبِيبَة Labībah Labībah Labībah LBIB? Labeebah
0050 ماهرة مَاهِرَة Māhirah Māhirah Māhirah MA?R? Mahirah
0051 ماوية مَاوِيَّة Māwīyah Māwīyah Māwīyah MAWI? Mawyah
0052 مائسة مَائِسَة Mā’isah Mā'isah Māʾisah MAES? Maisah
0053 نافع نَافِع Nāfi` Nāfi’ Nāfiʿ NAF` Nafia
0054 هجرس هَجْرَس Hajras Hajras Haǧras ?JRS Hajras
0055 هيفاء هَيْفَاء Hayfā’ Hayfā' Hayfaʾ ?IFAE Hayfa
0056 وليد وَلِيد Walīd Walīd Walīd WLID Waleed
0057 واثق وَاثِق Wāthiq Wāthiq Wāṯiq WACQ Wathiq
0058 يمنى يُمْنَى Yumna Yumnā Yumnā IMN/ Yumna
0059 ياسمينة يَاسَمِينَة Yāsamīnah Yāsamīnah Yāsamīnah IASMIN? Yasminah
0060 ياسر يَاسِر Yāsir Yāsir Yāsir IASR Yasir
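Of the schemes in the table, SATTS is the simplest to picture, since each Arabic letter is replaced by a fixed Latin symbol and no vowels are supplied. The sketch below uses only letter-to-symbol pairs inferred from the SATTS column of the table above; it is a partial mapping for illustration, not the full standard, and the names `SATTS_FROM_TABLE` and `to_satts` are assumptions. Input is expected to be the unvocalized form from the Entry column.

```python
# Letter-to-symbol pairs inferred from the SATTS column of the sample table.
# Assumption: partial and illustrative; letters absent from the table are missing.
SATTS_FROM_TABLE = {
    "ا": "A", "أ": "AE", "إ": "E", "ء": "E", "ئ": "E",
    "ب": "B", "ت": "T", "ث": "C", "ج": "J", "ح": "H",
    "خ": "O", "د": "D", "ر": "R", "ز": ";", "س": "S",
    "ش": ":", "ص": "X", "ض": "V", "ط": "U", "ظ": "Y",
    "ع": "`", "غ": "G", "ف": "F", "ق": "Q", "ك": "K",
    "ل": "L", "م": "M", "ن": "N", "ه": "?", "ة": "?",
    "و": "W", "ي": "I", "ى": "/",
}


def to_satts(name):
    """Transliterate letter by letter; unmapped characters pass through unchanged."""
    return "".join(SATTS_FROM_TABLE.get(ch, ch) for ch in name)


print(to_satts("حسام"))
print(to_satts("وليد"))
```

Because the mapping never looks at context, it needs no vocalization step, which is also why its output carries no vowel information, unlike the UNGEGN, ALA-LC, and DIN columns.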

Facts

Category Software | Reference MAPSONOROM | Family MAPSONO | Last updated 12/7/2018