Kalmasoft Databases and Glossaries

Synopsis
Kalmasoft maintains a central repository of bilingual term datasets, which is the largest English-Arabic set of databases. The data has been collected progressively over the last decade, with the main goal of providing effective and accurate reference materials to assist in developing Arabic language support for software packages and to support the ever-growing web technologies related to language engineering and NLP. We keep our data updated and organized so that the correct information is available whenever it is needed. All materials presented here are commercially available and can be customized or fine-tuned to meet your specific requirements.

 Anthroponyms (Onomasticon)
Database of Arabic Given Names
  • Thousands of Arabic given names in native script with Arabic diacritical marks (short vowels), English transcription, and gender fields. The transcription follows the common English spelling of each name, but other transcription systems are also available. A separate database of frequency statistics for each name can be supplied upon request.
Database of Arabic Surnames
  • This is perhaps the most interesting database for those developing NER applications or name-scoring software packages. As with the given names database, entries carry Arabic diacritical marks and a full English transcription. A separate database of frequency statistics for each name can be supplied upon request.
Database of Arabic Whole Names
  • A database of millions of real-world Arabic names collected from many sources and supplied with gender and locale fields, covering the entire Arabic region as well as three additional countries known to be under strong influence of Arabic culture.
Database of Names of Arabic Origins
  • A few hundred names of Arabic origin, mostly from countries known to have been under the umbrella of Islamic culture, e.g. Turkey, India, Spain, Persia and a few African countries.
Database of Foreign Names
  • Names from all over the world. What is new in this database is that an Arabic transcription is added to every English name, along with a gender field; most records also show additional information, e.g. locale and meaning.
Database of Unique and Indigenous Names
  • Unique and indigenous names from all Arabic speaking countries.
Database of Famous Names and Celebrities
  • Famous Names and Celebrities from 200 countries.
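
Names stored with diacritical marks often have to be matched against running text, which is usually written without short vowels. The following is a minimal Python sketch of stripping those marks before comparison; the function name and the sample value are illustrative assumptions and do not reflect the actual Kalmasoft record layout.

```python
import re

# Arabic combining marks U+064B..U+0652 (tanween, short vowels, shadda, sukun)
# plus the superscript alef U+0670.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(name: str) -> str:
    """Return the name without Arabic diacritical marks."""
    return _DIACRITICS.sub("", name)


# Hypothetical sample: a fully vocalized spelling of the name "Muhammad".
vocalized = "\u0645\u064F\u062D\u064E\u0645\u0651\u064E\u062F"
print(strip_diacritics(vocalized))
```

Keeping the vocalized form in the database and deriving the bare form on demand preserves the pronunciation information while still allowing lookups against unvocalized input.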

 Entities
English-Arabic Database of Entity Names
  • A full suite of bilingual databases covering almost all aspects of life, e.g. sports, politics, science, and more. Each database may have additional fields, e.g. "type", but all of them include the "locale" field.
English-Arabic Database of Arabic Entity Keywords
  • Unique and valuable, this is a database of all entity keywords found in Arabic. It also comprises the Arabic counterparts of entity keywords such as company, society, union, factory, committee and other terms. It is very useful for NER applications, web crawlers, search engines and CLIR applications.
English-Arabic Database of Street Names
  • First of its kind, this long-awaited database of odonyms (street names) is now available in Arabic. It covers world street names for more than two million geographical entities and is very useful for information retrieval systems, e.g. NER applications, web crawlers, search engines and CLIR applications.
Database of Arabic Colloquial Entity Names
  • A valuable listing of indigenous and rare names found in Arabic countries; the biggest database of its kind now available electronically.

 Acronyms and Initialisms
English-Arabic Database of Acronyms and Abbreviations
  • Thousands of acronyms and abbreviations covering many fields such as aviation, aerospace, military, sports, education, science, engineering, media, law, recreation and entertainment, and more.

 Toponyms (Place names)
English-Arabic Database of Arabic Place Names
  • A highly organized gazetteer (populated places only) of thousands of Arabic place names, ready for publishing on the internet, with many Arabic transcription systems.
English-Arabic Database of World Place Names
  • A world gazetteer (populated places only) with multiple fields, including latitude and longitude in DMS format, feature type and other minor fields. An Arabic transcription of the feature's name is added for each place name. This is very useful for localized software, entity scoring applications, navigation software and other geographical software.
English-Arabic World Gazetteer
  • The world gazetteer is a full-featured database of geographical information on the geographical makeup of all world countries and on physical features such as mountains, waterways, or roads. This database complements the two databases above.
English-Arabic Database of Geographical Terms
  • Most geographical terms can be found here. This database is compiled for use with electronic dictionaries and MT applications.
English-Arabic Database of Famous Places
  • This is part of the above database. It has all common geographic features, e.g. valley, creek, summit, as well as some world-famous features including oceans, continents and major cities.
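
Coordinates supplied in DMS (degrees, minutes, seconds) format usually need converting to signed decimal degrees before they can be used by mapping or navigation software. The sketch below shows that arithmetic in Python; the function name and the way the four values are passed are assumptions for illustration, since the field layout of the gazetteer is not described here.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert a DMS coordinate to signed decimal degrees.

    South and West are returned as negative values.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Hypothetical coordinate: 30 degrees 30 minutes 0 seconds North.
print(dms_to_decimal(30, 30, 0, "N"))
```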

 Fauna and Flora
English-Arabic Database of Domestic Animal Names
  • under construction
English-Arabic Database of Domestic Plant Names
  • under construction

 Orthographic Databases
Arabic Corpus
  • A tagged Arabic corpus encoded in UTF-8, Windows 1256, or the Kalmasoft generic transliteration system "KATS"; essential for MT applications based on statistical techniques, and as a reference for POS taggers and text parsers.
Database of Arabic Roots
  • An extended database of Arabic roots, available in two forms: in native script encoded in either UTF-8 or Windows 1256, or in the Kalmasoft native transliteration system "KATS", which uses ASCII characters to facilitate text processing. This is essential for every root-based Arabic processing system, in particular POS taggers and inflection generation systems.
Database of Arabic Full-form Verbs
  • Arabic full-form verbs that are actually found in ordinary running text; this database includes all regular conjugated verbs.
Database of Arabic Full-form Nouns
  • Arabic full-form nouns that are actually found in ordinary running text; this database includes all regular inflected nouns.
Monolingual Database of Arabic Stems
  • Presented in native script encoded in UTF-8 or Windows 1256, or in the Kalmasoft native transliteration system "KATS", which uses plain ASCII characters, this database is a major asset for many kinds of NLP applications. Uses include conjugation generators, spell checkers and more.
Bilingual Database of Loan Words in Arabic
  • Full information on 5,000+ loanwords of multiple origins, including English, French, Turkish, etc., encoded in native Arabic script and Kalmasoft "KATS" with their possible original parallels. This database is good for text abridging, parsers and other kinds of NLP applications.
Bilingual Database of Loan Terms in Arabic
  • Full information on 50,000+ loan terms of multiple origins, including English, Spanish, Italian, French, Turkish, etc., encoded in native Arabic script and Kalmasoft "KATS" with their possible original parallels. This database is good for text abridging, parsers and other kinds of NLP applications.
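
Because the native-script databases in this section are offered in either UTF-8 or Windows 1256, a consumer may need to transcode one into the other. The following Python sketch uses the standard `cp1256` codec for that; the function name and sample word are assumptions for illustration, and the proprietary "KATS" transliteration is not covered because its mapping is not described here.

```python
def cp1256_to_utf8(data: bytes) -> bytes:
    """Re-encode Windows 1256 bytes as UTF-8."""
    # Decode with the legacy Arabic code page, then encode as UTF-8.
    return data.decode("cp1256").encode("utf-8")


# Hypothetical sample: the word "salam" in Arabic script.
word = "\u0633\u0644\u0627\u0645"
legacy_bytes = word.encode("cp1256")
print(cp1256_to_utf8(legacy_bytes).decode("utf-8") == word)
```

Windows 1256 is a single-byte code page, so the four-letter sample occupies four bytes before conversion and eight bytes in UTF-8.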

 Semantic Databases
Database of Arabic Idiomatic Expressions
  • Hundreds of Arabic idiomatic expressions with their meanings and English parallels; important for MT applications.
Database of Arabic Newspaper Expressions
  • Thousands of Arabic newspaper expressions with their meanings and English parallels; important for MT applications.
Database of Arabic Proverbs
  • Thousands of Arabic proverbs with their English parallels; important for MT and TMM.

 Ontology and Semantic Databases
Database of Arabic Noun Ontology
  • Arabic noun ontology database (under construction).
Database of Arabic Verb Ontology
  • Arabic verb ontology database (under construction).

 Taxonomy
Database of Arabic Noun Taxonomy
  • Arabic noun taxonomy database (under construction).

Facts

Category Dictionaries | Reference DBASES | Entries 200,000,000+ | Last updated 3/01/2015