URDU on the MAC
by Kamal Abdali
Mac OS X is capable of editing and word processing in Urdu.
In a few simple steps, you can enable your Mac to handle Urdu documents.
This page could as well have been entitled "Urdu and Persian on the Mac", because the information given here can also be used to compose Persian (Farsi) documents. The keyboard, fonts, and explanations below apply equally to Persian, but the explanations are illustrated with Urdu words. The keyboard and fonts also suffice for Punjabi (Shahmukhi or Pakistani style, written in the Arabic script). But the keyboard does not have all the letters of Sindhi and Pushto alphabets.
Please email your enquiries, comments, criticisms, and suggestions to me at k.abdali@acm.org
If a US flag was not previously visible at the top right of
the screen (on the menu bar), it should now be displayed.
This is the Keyboard menu.
If you click on this flag, a menu will appear underneath it with
various icons and names representing all the active keyboards.
A keyboard named UrduPhonetic and an icon (UrduPhonetic Keyboard Icon)
somewhat like the flag of Pakistan should now appear.
If you click on it, then the keyboard icon on the top right of
the screen will turn into the Pakistani flag, and any keys pressed
on the Mac keyboard will produce Urdu characters.
You can switch between keyboards by clicking on the flags
(the keyboard icons).
On a Mac, it is best to use Naskh fonts (which are typically used in Sindhi, Arabic and Persian publications), not the Nastaleeq fonts (which are used in most Urdu newspapers). Although Nastaleeq fonts are available for the Mac, they don't work as robustly as the Naskh fonts. But if you want to experiment with Nastaleeq fonts anyway, they are discussed in the section Nastaleeq Fonts further below.
Knut Vikor's excellent page The Arabic Macintosh has a detailed discussion of various Arabic Fonts with interesting information about them and links to obtain them.
The Mac OS comes with only one Naskh font, Geeza Pro, which can be used for Urdu, but the characters do not look particularly nice. Two free Naskh fonts for preparing pleasant-looking Urdu documents can be downloaded from SIL International.

Your Mac is now ready for handling documents in Urdu.
The UrduPhonetic keyboard layout that you have downloaded has been designed to closely resemble the phonetic keyboard of InPage, a popular commercial desktop publishing application for Urdu that runs under Windows.
The advantage of a phonetic keyboard is that you easily remember most of the keys (e.g., the key "b" for the Urdu letter "bay", "p" for "pay", "k" for "kaaf", "g" for "gaaf", etc.). As we do not have enough keys on the standard computer keyboards to assign to all Urdu characters, we need to use shifted keys for some (e.g., "shift-k" for "khay", "shift-g" for "ghain", etc.)
In addition to being phonetic, this is also a Unicode keyboard layout. Whatever you type is converted to its Unicode representation. Unicode is the modern universal character encoding used in computers for multi-lingual texts.
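To make the idea of a "Unicode representation" concrete, here is a minimal Python sketch that prints the code point, official Unicode name, and UTF-8 byte count for the four letters mentioned above. The code points in the table are an assumption about what the keyboard emits (they are the usual Unicode choices for Urdu), not a listing taken from the UrduPhonetic layout file.

```python
import unicodedata

# Letters named in the text. The code points are an assumption about what
# the keyboard emits; they are the usual Unicode choices for Urdu.
LETTERS = {
    "b": "\u0628",  # bay
    "p": "\u067e",  # pay
    "k": "\u06a9",  # kaaf
    "g": "\u06af",  # gaaf
}


def describe(ch):
    """Return (code point label, official Unicode name, UTF-8 byte count)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), len(ch.encode("utf-8")))


for key, ch in LETTERS.items():
    print(key, *describe(ch))
```

Each of these letters occupies two bytes when the text is stored as UTF-8.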
The Keyboard menu has an item Show Keyboard Viewer. If you select this item, then the system will display a small picture of the keyboard on the screen. In this picture you can see what character each key corresponds to. The characters will change appropriately if you press the shift key or option key (or another modifier key) or select a different keyboard from the Keyboard menu.
The keyboard viewer picture is quite small, and the characters on the keys are hard to see. So below are larger pictures of the UrduPhonetic Keyboard showing the characters corresponding to the keys in plain, shift, option, and option-shift modes. (If you like, you can save these pictures by CTRL-clicking or right-clicking on the pictures and selecting the Save Image As... item from the menu that pops up. You can then print the keyboard pictures for reference.)




Even though we have tried to make the keyboard layout as phonetic
as possible, the mismatch between the Urdu alphabet and the available
keys on a Western keyboard has forced us to make some unintuitive
mappings between letters and keys.
But with a little practice you should be able to type most letters
from memory.
The above pictures contain all the information that there is to give about the UrduPhonetic key assignments. But for quick reference here are some tables of useful key bindings.



An email message is usually a very simple document. If you compose your email on a Mac, and your recipient is also going to read your email on a Mac, then you can try writing your messages in Urdu. In fact, the messages are readable even on any non-Mac machine that has been configured with system options for right-to-left languages and on which appropriate fonts have been installed. An occasional character in these messages might be undecipherable, and might be replaced with its Unicode icon (or some gibberish, in the worst case).
Both Gmail and YahooMail systems work admirably when the Urdu Phonetic input is turned on and message composition is in Rich Text format with right-to-left text direction. With Hotmail, mixing right-to-left and left-to-right text in the same line seems problematic, as that seems to spoil the correct sequencing of words. In each case, the cursor behavior is a bit erratic; the cursor sometimes shows up at the right end of the line instead of being at left next to the last word typed. But you can ignore all that since your text is still set correctly. The cursor behavior will likely be fixed by Apple and email system producers anyway.
It helps to set the Web browser preference for Default Character Encoding to Unicode (UTF-8). If you use the Firefox Web browser, then in its preferences set all Fonts for Arabic to Scheherazade-AAT, Monospace Font Size to 14, and other sizes to 24. The Safari Web browser does not allow language by language font control, so just set Standard Font to Scheherazade-AAT 24.
Gmail allows specifying Urdu as the application language. In that case all the menus, titles, warnings, etc., will be translated to Urdu. But you do not need this extreme setting in order just to read and write Urdu email messages. While keeping English as the system working language, you can type Urdu text by selecting the UrduPhonetic input. Of course, you can also intersperse texts in Urdu and English by simply switching back and forth between UrduPhonetic and English keyboards.
For the Mac, there are two free office suites, OpenOffice and NeoOffice, each of which includes a powerful word processor that can be used to edit Urdu documents. OpenOffice is a multi-platform open-source software application, with a Mac-specific version called OpenOffice Aqua. NeoOffice is another Mac-specific implementation of OpenOffice. If you want to install one or both of them, you can find detailed descriptions and installation instructions on their official Web sites OpenOffice and NeoOffice. We will not describe their use.
A highly recommended editor is Bean for which the general information and download instructions can be found on its official Web site. This is a free, easy to use, small, and very efficient word processor. Bean is particularly suitable for typesetting documents with Nastaleeq fonts, as we discuss in the section Nastaleeq Fonts below.
The Mac's built-in application TextEdit is good enough for simple documents. TextEdit is considered a text editor rather than a word processor. Yet it can be used for composing documents with multilingual text, embedded graphics, tables, and other advanced features typically found only in large, expensive software applications. Its advantage is that it doesn't need to be installed: It is always there, and is the Mac's default editor for text files.
In Plain Text mode, TextEdit allows only a single font and a single paragraph justification style for the entire document. In Rich Text mode, you can mix various font families, font sizes, font styles (e.g., bold, outlined, shadowed), and justifications (e.g., centered text, or text justified at left or right or both sides). You need to use Rich Text since Plain Text does not work well with Urdu.
To start a new Urdu document, select the menu item File > New, then Format > Text > Writing Direction > Right to Left. Set the Keyboard menu (Top Right) to Urdu Phonetic. Choose fonts, font styles, size, colors, etc., as is usual with most word processors. Since the default formatting is Plain Text, switch to the Rich Text Format by doing Format > Make Rich Text. Now you can apply a different justification to each paragraph, and a different formatting style to each selection.
It is helpful to also do Format > Font > Show Fonts. This puts on the screen a font palette which is convenient for choosing font family (e.g., Scheherazade or Lateef), size, color, etc. The system seems to unpredictably switch the font sometimes to Geeza Pro (the "system default font" for the Arabic script). So you need to watch the font palette and, if necessary, change the font back to what you want it to be.
You can configure TextEdit to use the Scheherazade font as default. To do this, start TextEdit, and on the Menu bar click on TextEdit, then on Preferences, and then on the New Document tab. If the Rich Text radio button is not active, click on it. Click on the Change... button next to Rich text font: . The font dialog will open. Select Scheherazade-AAT in the family column, and 24 in the size column, then close the dialog. Finally, close Preferences, and quit TextEdit. When you restart TextEdit, it will use the Rich Text format and the Scheherazade size 24 font as the default for new documents.
Here is an image of a portion of the Mac screen during the editing of an Urdu document using TextEdit.

In Mac OS 10.5 and above, it is possible, with some care, to use Nastaleeq fonts with TextEdit and Bean. Some freely available Nastaleeq fonts are:
Try Urdu Web to obtain the last three fonts. Beware that the InPage company alleges that some freely available Nastaleeq fonts are pirated from their work.
Note: To install a font file, copy it to /Library/Fonts. On Mac OS 10.6, you can also install a font file by double clicking on it, then clicking on the Install Font button in the dialog box presented to you.
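The copy step in the note above can also be scripted. The following is a minimal sketch, assuming you pass the destination folder yourself; the function name and the example file name are hypothetical. Copying into /Library/Fonts normally requires administrator rights, while ~/Library/Fonts is the per-user font folder.

```python
import shutil
from pathlib import Path


def install_font(font_file, fonts_dir):
    """Copy font_file into fonts_dir and return the path of the copy."""
    src = Path(font_file)
    dest_dir = Path(fonts_dir).expanduser()
    if not src.is_file():
        raise FileNotFoundError(src)
    # Create the destination folder if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir))


# Example call (hypothetical file name):
# install_font("SomeNastaleeqFont.ttf", "~/Library/Fonts")
```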
CAUTION: At present, Bean is the best editor to use with Nastaleeq fonts. Nastaleeq does not work at all in OpenOffice and NeoOffice. The former displays in isolated form the characters that are specified to be in any Nastaleeq font. The latter replaces all Nastaleeq fonts with the Naskh Geeza Pro font. The behavior of Nastaleeq is quite erratic in TextEdit also.
OBSERVATION: Nastaleeq works satisfactorily with TeX, discussed below in the section Typesetting Using TeX, LaTeX, XeTeX.
Here are some conclusions from experiments with the above fonts using TextEdit under MacOS 10.6:
Summary: Nafees is the only satisfactory font. While Jameel and Faiz have very nice quality, they fail to render many typed words, and can be quite annoying at times. Pak is not acceptable at all in its current state because of its letter shapes. Fajer is too unstable to be used with confidence, which is a pity because its shapes are nice and the failures are limited to just a few ligatures.
Details:
By comparison with Nastaleeq fonts, Naskh fonts work much better on the Mac, and editing with them is generally trouble-free.
Here is an image of the TextEdit window during the editing of a document with the Jameel Noori Nastaleeq font (size 48 for the title, size 36 for the author's name, size 24 for the text body).

InPage is a commercial desktop publishing application for Windows. It is widely used by publishing houses for producing Urdu publications because of its rich feature set, multi-lingual and multi-script capabilities, and robustness. Until recently, it was one of very few applications that could produce high-quality Nastaleeq documents.
Unfortunately, InPage works only on Windows and, moreover, uses proprietary document structure and fonts. Naturally, there is much interest in converting InPage files into alternate, more portable versions that could be processed on multiple computing platforms with multiple applications. So several online tools and programs have become available to convert InPage files to Unicode text files. In fact, InPage itself has a Unicode conversion facility of sorts through its copy and paste Edit menu items.
The conversion programs that I tried turned out to have errors or limitations, so I had to write one myself. This is a command line application in universal binary code, and works on both PowerPC and Intel Macs. It requires Mac OS version 10.4 or higher. As I do not have access to any documentation of the inner structure of InPage documents, this program is based on guesswork. Although I have run it successfully on several large documents, please use it at your own risk. Also, since it is a command line application (i.e., a shell command), you need some rudimentary Unix skills to use it productively. Please let me know if you encounter any bugs. Here are the instructions for the simplest way to use it:
./InpToUniTxt story.inp story.txt
and press Enter.
CAUTION: If your files or the saved application are in different
directories, then make sure to use the right path for each file.
Please be aware that the InPage document's formatting properties
(justifications, fonts sizes and styles, colors, etc.) are lost during
the conversion.
The conversion mainly consists of text extraction.
The term TeX is used here in a generic sense for TeX or any of its derivatives, such as LaTeX, AMSTeX, ArabTeX, XeTeX, ArabXeTeX, etc. But when the discussion is about a particular derivative, that system is mentioned by name, e.g., XeTeX.
When you use a word processing system, you take formatting actions yourself, and the system keeps displaying the document as it changes in response to your actions. When you use TeX, you put the document contents (text, images, etc.) and the formatting instructions together in one or more tex files, and the system processes them to produce the desired document. The tex files you prepare constitute a TeX program. The TeX system executes this program to produce the desired document, typically in the form of a PDF file.
TeX has a steep learning curve, but once mastered it allows you to produce very complex, high-quality documents, and provides you very fine control over the look and feel of the document. TeX is widely used for producing scholarly works, and many scientific journals and conferences require that articles be submitted to them in the form of tex files.
Various distributions of TeX are available for the Mac. We highly recommend the TeX Live distribution together with the TeXShop graphics environment for using TeX. For this, download and install the MacTeX package. The download, installation, and usage instructions are on the TeXShop Web site. The MacTeX package includes most of the components needed for processing Urdu documents. In particular, it includes the system called XeTeX which currently offers the best facilities for Urdu.
XeTeX has overcome two limitations that were not satisfactorily addressed by the previously existing derivatives of TeX, and that, in particular, greatly hampered the production of Urdu documents with TeX:
TeXShop makes it very easy to edit and execute TeX programs. When you launch TeXShop, it will bring up a window in which you can edit your TeX program. But you should first set TeXShop's Preferences. The most important preferences are in the Typesetting tab. Click to open this tab, set Default Command to XeLaTeX, and set Default Script to Pdftex. You can change the Typesetting command to LaTeX or other choices on the editing window itself. Once the Preferences are taken care of, you are ready to edit your program. The editor is powerful, and yet very simple and intuitive. Once your program is complete, you should click on the Typeset button. TeXShop will execute your program, and will display the resulting PDF file if the program ran successfully. It will also bring up a Console window with progress and error messages.
We will not provide any information about TeX, LaTeX, and XeTeX beyond the above descriptions. You have to learn these on your own! Also, you should be able to open a command-line window, referred to below as a Console window. For this, open a Finder window, go to the Applications folder, then to the Utilities folder, and click on Terminal.app. (The Console.app utility in the same folder only displays system logs and does not accept commands.) It is advisable to drag Terminal.app to the Dock, so that an alias to it is available to you all the time. You will also frequently need to change directory (using the cd command) to the folder of your tex files.
We now give an example of typesetting an Urdu ghazal using XeTeX. The first image below shows the TeX program, poem-ra.tex, composed via the editor Bean. XeTeX (actually the program xelatex) processes this file to produce the PDF file poem-ra.pdf, shown in the next image. Ignore the red underlining in the tex window. Bean's spell checker has flagged the tex commands as they are not proper English words!
The bulk of the poetry formatting is done by the bidipoem package written by Vafa Khalighi ( وفا خلیقی ) (see CAUTION below). Note how simple the TeX code is in this case.
Some special steps are needed to get the typesetting done. It is unfortunately necessary to sidestep the TeXShop system. TeXShop is an excellent program for editing tex files and running TeX on them as long as the file contains only ASCII characters (the ones you can type from the standard English keyboard of your Mac). You can also change to the UrduPhonetic keyboard, and type Urdu characters. They will be displayed in the TeXShop window correctly as typed, and will also appear connected into Urdu words as normal. But once you save and close the file, then on reopening that file the Urdu characters will have disappeared, replaced by undecipherable characters.
Since at present TeXShop cannot save Urdu characters, it is better to use an editor like Bean to prepare tex files. As TeX commands are in English and most of your text will be in Urdu, you can just keep switching the Mac keyboard layout between U.S. and UrduPhonetic. To protect yourself against accidental loss of data, make it a habit to save your work periodically. Launch Bean and start the edit by clicking on New under the File menu. After typing a line or so, choose Save As… from the File menu. In the Save dialog, navigate to the folder where you want to keep tex files, say texstuff in your Home folder. Then click on the File Format: menu, and from the drop-down list select the item Text (you provide extension). Then in the SaveAs: box, type the name you want with the extension .tex (e.g., poem.tex), and click the Save button. The dialog will close, and now you should see the correct name at the top of the edit window. Keep saving the file periodically until the file is complete.
The next step is to process the tex file. Instead of TeXShop, we are going to use the command xelatex. If TeXShop was installed with the full installation option, then xelatex must be in your system. To check this, open a Console window, and in it type the command
which xelatex
If xelatex has been properly installed, the system will type back something like
/usr/texbin/xelatex
Now use the cd command to change the directory to where your poem.tex file is, say texstuff in your Home folder, by doing something like
cd ~/texstuff
Then type the command
xelatex poem.tex
The system will produce a lot of messages, and if the processing was completed without any errors, then messages like the following should appear at the end:
Output written on poem.pdf (1 page)
Transcript written on poem.log
The poetry typesetting sometimes needs the xelatex command to be re-executed. This is indicated in the Console messages, but the Console output may be too long to search for this particular message. There is no harm in repeating the xelatex command anyway, so after its first successful operation just type again:
xelatex poem.tex
The poem.pdf file should show up in the same folder where the tex file is. Double click on it so it will be displayed by Acrobat Reader.
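Rather than repeating the command blindly, the decision to rerun can be automated by searching the log file. The sketch below is only a heuristic under stated assumptions: it treats any log line containing the word "rerun" as a request for another pass (the exact wording used by bidipoem is not documented here, so this match is a guess), and the function names are hypothetical. The -interaction=nonstopmode option keeps xelatex from stopping to wait for keyboard input on errors.

```python
import shutil
import subprocess
from pathlib import Path


def needs_rerun(log_text):
    """Heuristic: True if any log line mentions 'rerun' (case-insensitive)."""
    return any("rerun" in line.lower() for line in log_text.splitlines())


def typeset(tex_file, max_passes=3):
    """Run xelatex on tex_file, repeating while the log asks for a rerun.

    Returns the number of passes made, or 0 if xelatex is not on the PATH.
    """
    if shutil.which("xelatex") is None:
        return 0
    tex_path = Path(tex_file)
    log_path = tex_path.with_suffix(".log")
    for passes in range(1, max_passes + 1):
        # Run inside the folder of the tex file, like "cd" followed by "xelatex".
        subprocess.run(
            ["xelatex", "-interaction=nonstopmode", tex_path.name],
            cwd=tex_path.parent,
            check=True,
        )
        if not needs_rerun(log_path.read_text(errors="replace")):
            return passes
    return max_passes


# Example call, assuming poem.tex is in texstuff in your Home folder:
# typeset(Path.home() / "texstuff" / "poem.tex")
```

A loaded package whose name happens to contain "rerun" would trigger a harmless extra pass, which is why the number of passes is capped.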
CAUTION: The above tex file needs the package called bidipoem (developed by Vafa Khalighi). This package might be too recent to have been included in your TeXShop installation. If you get the error that the system cannot find bidipoem.sty, then you should download that file from this link bidipoem.sty. See the documentation of TeXShop about where to store such packages that are not included in the MacTeX installation. Of course, you can also save the downloaded file in the same directory where your tex files are, e.g. texstuff in your Home directory.


Modern web browsers are quite good at interpreting and displaying multi-lingual texts from their Unicode character encodings. Of course, the browser needs to be told that it should expect Unicode material in the web document (usually, an html file) that it is being asked to display. The Unicode character encoding most commonly used on the web, which covers Urdu and Persian letters along with the letters of many other languages, is called UTF-8. So to display Urdu text, you have to specify in your web document that its character set is given by UTF-8, as explained next.
The particular character set that a web document contains is specified by the meta statement. Near the beginning of your html file you will find some code that looks like this:
   <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
(This is just an example.
Your character set might have a name different from "ISO-8859-1".)
You have to change the character set declaration to "UTF-8", by replacing
the above meta statement by:
   <meta content="text/html; charset=UTF-8" http-equiv="content-type">
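If you have several html files to update, the replacement can be scripted. Here is a minimal sketch that rewrites the first charset declaration in a string of html; the function name and the regular expression are illustrative assumptions, not part of any particular tool.

```python
import re

# Matches charset=<name> inside a meta content attribute,
# for example charset=ISO-8859-1.
CHARSET_RE = re.compile(r"(charset\s*=\s*)[\w.:-]+", re.IGNORECASE)


def declare_utf8(html_text):
    """Return html_text with its first charset declaration set to UTF-8."""
    return CHARSET_RE.sub(r"\g<1>UTF-8", html_text, count=1)


old = '<meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">'
print(declare_utf8(old))
```

Note that this changes only the declaration. The html file itself must also be saved in the UTF-8 encoding by your editor.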
Any Unicode inserted after this meta statement will be displayed as the character that the code represents. The Unicode for Urdu and Persian can be found in the Unicode Arabic page. A table which gives the standard Unicode as well as its html representation, called html numeric character reference, is given here. A very useful online tool is UTF Converter that lets you quickly convert a string of one or more characters to Unicode in various formats. UTF Converter's author, Mark Davis, has a Web site Macchiato with several other very useful Unicode-related utilities.
From the table on page 2 of Unicode Arabic page, you can check that the hexadecimal Unicode representations of the Urdu letters Alif, Re, Daal, and Vaao are, respectively, 0627, 0631, 062F, and 0648. Now the html syntax for a hexadecimal code HHHH is &#xHHHH; . So suppose in your html document you insert the following:
   <center>
   <big><big><big>
   &#x0627;&#x0631;&#x062F;&#x0648;
   </big></big></big>
   </center>
The result will be the word "Urdu" (in Urdu) displayed in 3-size larger letters and centered in a line, as follows:
اردو
Typing numerical codes in this way is clearly impractical except for displaying just a few characters. Fortunately, you don't have to enter character codes manually if you use the UrduPhonetic keyboard layout. The characters typed on this keyboard are automatically converted to their Unicode version and placed in the input. All you have to do is to switch to UrduPhonetic on the keyboard menu at the point in your html file where you desire to insert Urdu text.
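For the occasions when numeric character references are still wanted (for example, in an html file that must remain pure ASCII), they can be generated rather than typed by hand. The following is a small sketch; the function name to_ncr is hypothetical.

```python
import html


def to_ncr(text):
    """Encode every non-ASCII character as an &#xHHHH; reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):04X};" for ch in text)


urdu = "\u0627\u0631\u062f\u0648"  # alif, re, daal, vaao
encoded = to_ncr(urdu)
print(encoded)  # &#x0627;&#x0631;&#x062F;&#x0648;

# The standard library can decode the references back to the letters.
assert html.unescape(encoded) == urdu
```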
A caveat is in order here. To prepare html files, you are likely to use some special editor different from TextEdit. We have seen that, in RichText mode, TextEdit processes Urdu letters correctly, displaying the right form of the letter and connecting the letters appropriately. Other editors, especially the so-called programmer's editors often used to prepare html files, may not do all that. For example, your typed Urdu letters might be displayed in their isolated form from left to right in the order of their entry, without being connected together. Or worse, your typed input might appear garbled in even more annoying ways! If you are looking for a free html editor that handles Unicode and UTF-8 well, and displays Urdu text correctly, try Komodo Edit.
Of course, the readers of your Web page will be able to see the Urdu text correctly only if their system has been configured for multi-lingual processing and has the Urdu fonts installed. In addition, it might be necessary for your readers to set the viewing option of their web browser for "Unicode (UTF-8)" character encoding.
Most of the reported installation difficulties turned out to have a simple reason: during download or extraction, the file extensions got changed. Often a .txt extension was appended to one or more file names.
So first please make sure that your Mac shows extensions in file names. For this, move into Finder (for example, by clicking in a Finder window, or on the Finder icon in the Dock, or at a point on the screen which is not occupied by an application window). Then on the Menu bar (the one with the Apple icon at the left), click on Finder, then on Preferences, then on the Advanced tab. Now look at the Show all file extensions item. If the check box on its left does not have a check mark, then click on it so that a check mark appears there. Finally, close the Advanced window.
Now you can check whether the extensions of the UrduPhonetic files are correct. The downloaded file (UrduPhonetic.zip) and the files that your unzipper extracts (UrduPhonetic.keylayout and UrduPhonetic.icns) should have exactly those names. Change their extensions if necessary, ignoring the Finder's complaint that this could render your files dysfunctional.
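The check can also be done with a short script. This sketch handles only the most common case described above, where an extra extension such as .txt was appended to a correct name; the function names are hypothetical, and the folder you pass in is assumed to be the one holding the downloaded files.

```python
from pathlib import Path

EXPECTED = ("UrduPhonetic.zip", "UrduPhonetic.keylayout", "UrduPhonetic.icns")


def corrected_name(name):
    """Return the expected name if `name` has an extra suffix, else None."""
    for good in EXPECTED:
        if name.startswith(good + "."):
            return good
    return None


def fix_folder(folder):
    """Rename files with an appended extension; return (old, new) name pairs."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        good = corrected_name(path.name)
        # Never overwrite a correctly named file that already exists.
        if good and not (path.parent / good).exists():
            path.rename(path.parent / good)
            renamed.append((path.name, good))
    return renamed
```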
Another problem some people have encountered is that during editing Urdu letters show up isolated rather than connected together in the normal way. This can happen when the editor being used is different from TextEdit or Bean. For example, at present Microsoft Word does not handle the Naskh and Nastaleeq scripts correctly on the Mac. Even in TextEdit, sometimes Urdu letters appear isolated rather than correctly connected. This is usually due to TextEdit being run in the plain text mode rather than the rich text mode which Urdu editing requires.
To fix this problem, start TextEdit, and on the Menu bar click on TextEdit, then on Preferences, and then on the New Document tab. If the Rich Text radio button is not active, click on it. Now close Preferences, and quit TextEdit. When you restart TextEdit, it will use Rich Text as the default for new documents.
A related problem that has troubled some people is that in their Urdu files some letters don't seem to have correct shapes. For example, the letters "ChoTi Hay" or "yay" don't connect to the preceding or following letters properly. The culprit in such cases is nearly always the font used. At present only Scheherazade, Lateef, and Geeza Pro among Naskh fonts, and Nafees and Jameel Noori among Nastaleeq fonts are known to work correctly. Please let me know if you discover (or design) other well-behaved fonts for Urdu.
In Urdu, short vowels (e`raab) are denoted by diacritical marks that are placed above, below, or to the left of the letter involved. Although usually omitted, they are occasionally needed to remove ambiguity or to show the correct pronunciation of a word. In particular, the tashdeed and madd signs and the zer of izaafat combinations are always helpful to the reader of the text.
While composing text, you should type such a mark after typing the letter to which it belongs. The most frequently used marks are: zabar (shift->), zer (shift-<), pesh (shift-P), tashdeed (shift-_), and madd (shift-+). Alif with madd can be typed directly as shift-A. The "jazm" mark (shift-Q), which should print like a tiny "daal", doesn't have that shape in existing fonts. The alternative, also unattractive, is the "sukun" mark (/) of Arabic orthography that looks like a little circle.
A complete list of diacritical marks is given earlier with the keyboard images.
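The rule that a mark is typed after its letter mirrors how the text is stored: each mark is a separate Unicode combining character that follows its base letter. Here is a minimal sketch; the mapping from the Urdu names to Unicode marks is an assumption based on the standard Arabic-script diacritics, and strip_marks is a hypothetical helper.

```python
import unicodedata

# Diacritics named in the text, with the Unicode marks they are assumed to be.
MARKS = {
    "zabar": "\u064e",     # ARABIC FATHA
    "zer": "\u0650",       # ARABIC KASRA
    "pesh": "\u064f",      # ARABIC DAMMA
    "tashdeed": "\u0651",  # ARABIC SHADDA
}


def strip_marks(text):
    """Remove combining marks, leaving only the base letters."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# "muhabbat" with tashdeed: meem, baRi hay, bay, tashdeed, tay.
muhabbat = "\u0645\u062d\u0628" + MARKS["tashdeed"] + "\u062a"
for ch in muhabbat:
    kind = "mark" if unicodedata.combining(ch) else "letter"
    print(f"U+{ord(ch):04X}", kind, unicodedata.name(ch))
```

Stripping the marks in this way is handy when searching, since the same word may appear with or without its optional diacritics.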
The I and Y keys correspond, respectively, to the maaroof and majhool forms of "yay", popularly referred to as "ChoTi yay" ی and "baRi yay" ے , respectively. (See the note below about maaroof and majhool sounds.) Thus, "galee" گلی (meaning lane) is to be typed as G, L, I, and "taaray" تارے (meaning stars) is to be typed as T, A, R, Y.
The form entered by Y does not connect to the next letter. So even a majhool "yay" letter that occurs in the middle of a word should be typed as I. For example, "bayTay" بیٹے (meaning sons) has to be typed B, I, shift-T, Y. Even though both "yay" letters occurring in this word are pronounced with the majhool sound, the first one has to be entered as I.
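The key sequences in these examples can be checked with a toy model of the keyboard. The table below contains only the keys used in the examples above, and the code points are assumptions (the usual Unicode choices for Urdu), not values read from the UrduPhonetic layout file.

```python
# Partial, assumed key table: only the keys used in the examples above.
KEYS = {
    "A": "\u0627",        # alif
    "B": "\u0628",        # bay
    "G": "\u06af",        # gaaf
    "I": "\u06cc",        # ChoTi yay (joins to the next letter)
    "L": "\u0644",        # laam
    "R": "\u0631",        # ray
    "T": "\u062a",        # tay
    "Y": "\u06d2",        # baRi yay (never joins to the next letter)
    "shift-T": "\u0679",  # Tay
}


def type_keys(keys):
    """Return the text produced by pressing the given keys in order."""
    return "".join(KEYS[k] for k in keys)


for word, keys in [
    ("galee", ["G", "L", "I"]),
    ("taaray", ["T", "A", "R", "Y"]),
    ("bayTay", ["B", "I", "shift-T", "Y"]),
]:
    print(word, [f"U+{ord(ch):04X}" for ch in type_keys(keys)])
```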
In Arabic, the letter "yay" has two dots underneath. In Urdu, the two dots are shown only if "yay" appears at the beginning or in the middle of a word, but not when it is the final letter of a word or when it stands alone (e.g., in an alphabet table). If needed, the "yay" with two dots ي can be typed as option-i.
Noon Ghunna, which appears as the letter Noon but without a dot, is entered as shift-N. Thus "maaN" ماں (mother) is typed as M, A, shift-N. Noon Ghunna adds a nasal quality to the sound of the vowel preceding it.
In the freewheeling, inconsistent way of Urdu orthography, Noon Ghunna is used only at the end of a word. In the middle of a word, even where Noon Ghunna would be appropriate, Urdu just uses the ordinary Noon. Examples: "saaNp" سانپ (snake) has to be entered as S, A, N, P; or "pataNg" پتنگ (kite) has to be entered as P, T, N, G. This inconsistency is forced by the circumstance that in the middle of a word, Noon is written as a shosha with a dot above. Without a dot, such a shosha would be visually quite confusing.
In some very old books, especially Urdu instructional primers, Noon Ghunna was indicated by a tiny inverted "v" (circumflex accent) placed above the Noon. This worked both in the middle and at the end of a word. An equivalent sign, ٘ , is still available as option-n although its use in Urdu went out of style decades ago. Note that, by contrast, Hindi takes the rational approach of signifying the nasal modification by always placing a special mark above the affected letter.
The main forms of this letter are:
1) independent ء , entered as shift-4,
2) hamza above "alif" أ , entered as the hyphen key (-),
3) hamza in the middle of a word ئ , entered as U,
4) hamza above "vaao" ؤ , entered as shift-W, and
5) hamza above "ChoTi Hay" ۂ , entered as the equal key (=).
If hamza is the last letter of a word, use the independent hamza form (shift-4). The words in which this happens are generally derived from Arabic. Examples: "ziaa" ضیاء (meaning light) is entered by typing shift-J, I, A, shift-4; "zakaa" ذكاء (intelligence) is entered by typing shift-Z, K, A, shift-4. This hamza is usually omitted in modern Urdu publications.
Sometimes the words with a terminal hamza are part of a word combination such as "ziaa uddin" ضیأالدّین . It is then traditional to use the "hamza above alif" form. The above combination is entered as shift-J, I, -, A, L, D, shift-_, I, N. (This L is written but is silent, and the "daal" is pronounced with a tashdeed.)
It is important to understand that "hamza above alif" أ means two different things in Urdu and Arabic orthographies. In Urdu, it is a compactly written combination of two letters, the vowel alif followed by the consonant hamza. In Arabic, it stands for the consonant hamza alone (operated with the short vowel zabar), and is equivalent to Urdu's alif with zabar اَ .
If the letter hamza occurs in the middle of a word, use the key U for it. When typed, it is displayed as a hamza over the letter "yay" ئ. But as soon as the next letter is typed, the yay disappears, and the correct combination of hamza and the next letter is displayed. Examples: "ghaael" گھائل (wounded) entered by typing G, H, A, U, L; "chaae" چائے (tea) entered by typing C, A, U, Y; "na-i" نئی (new) entered by typing N, U, I.
However, even in the middle of a word if a hamza precedes a "vaao", and this pair starts an isolated subword, then the two should be typed together as the single "hamza above vaao" key (shift-W). (To start an isolated subword, this pair should come after an alif, vaao, daal, ray, etc.) Example: "gaaoN" گاؤں (village) should be entered by typing G, A, shift-W, shift-N, and not G, A, U, W, shift-N which would result in the wrong shape گائوں !
The isolated subword condition is important. Otherwise just a medial hamza form (key U) is to be used. Example: "gau maataa" گئو ماتا (Mother Cow) should be entered by typing G, U, W, space, M, A, T, A; typing G, shift-W, space, M, A, T, A would result in the wrong shape گؤ ماتا !
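The "hamza before vaao" rule described above can be sketched as a small text check. This is only an illustration: it assumes the U key produces U+0626, the W key U+0648, and shift-W U+0624, and the set of non-joining letters listed in it is a partial, assumed list. The function name fix_hamza_vaao is hypothetical.

```python
import re

# Assumed code points for the keys discussed above.
HAMZA_ON_YAY = "\u0626"   # medial hamza, typed with the U key
VAAO = "\u0648"           # vaao, typed with the W key
HAMZA_ON_VAAO = "\u0624"  # "hamza above vaao", typed with shift-W

# Assumed, partial set of letters that never join to the following letter
# (alif, alif madd, hamza above alif, daal, Daal, zaal, ray, Ray, zay,
# zhay, vaao, hamza above vaao). Whatever follows them starts a new subword.
NON_JOINERS = (
    "\u0627\u0622\u0623\u062F\u0688\u0630"
    "\u0631\u0691\u0632\u0698\u0648\u0624"
)

# A hamza-on-yay + vaao pair that comes right after a non-joining letter.
_WRONG_PAIR = re.compile(f"(?<=[{NON_JOINERS}]){HAMZA_ON_YAY}{VAAO}")


def fix_hamza_vaao(text):
    """Replace the pair with the single 'hamza above vaao' letter,
    but only where the pair starts an isolated subword."""
    return _WRONG_PAIR.sub(HAMZA_ON_VAAO, text)


gaaon_wrong = "\u06AF\u0627\u0626\u0648\u06BA"  # typed G, A, U, W, shift-N
gau = "\u06AF\u0626\u0648"                      # typed G, U, W

print(fix_hamza_vaao(gaaon_wrong))  # pair follows alif, so it is merged
print(fix_hamza_vaao(gau))          # pair follows gaaf, so it is left alone
```

The lookbehind keeps the preceding letter untouched, so only the two-letter pair is rewritten.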
The "hamza above ChoTi Hay" ۂ occurs in "izaafat" combinations derived from Persian, and it is helpful to add a "zer" sign below it. Examples: "sitaara-e shaam" ستارۂِ شام (evening star) should be entered by typing S, T, A, R, =, shift-<, space, X, A, M. Or, "naala-e dil" نالۂِ دِل (heart's cry) should be entered as N, A, L, =, shift-<, space, D, (optionally, shift-<), L.
The form of the letter "ChoTi Hay" with a hamza above can occur only in the terminal and isolated positions of a word, while the form without a hamza can occur in all positions: initial, medial, terminal or isolated. One should be careful in choosing the correct form of "ChoTi Hay" in "izaafat" combinations. The form without hamza should be used when the "ChoTi Hay" ending a word is pronounced as H, as in "tah" تہ (layer or bottom). The form with a hamza above should be used when the ChoTi Hay ending a word is pronounced as A or E, as in "gila" گلہ (complaint). This point is taken up again in the next subsection.
"BaRi Hay" ح (humorously called "Halvay Vaali Hay") is entered by typing shift-H. Thus "muhabbat" محبّت (love) is entered by typing M, shift-H, B, (optionally shift-_ for tashdeed), T.
"Dochashmi Hay" ھ is entered by typing unshifted H. In modern Urdu orthography, this letter is used only in combination with some consonant (which precedes it), and its purpose is to modify that consonant's sound to make it an "aspirated letter".
"ChoTi Hay" ہ , entered by typing the letter O, is pronounced separately by itself rather than being just used to "aspirate" another consonant. For example, the "Hay" sound is pronounced independently in the word "kahaa" كہا (said); so this word is typed with a "ChoTi Hay", as K, O, A. This is in contrast to the word "khaa" كھا (Eat!) where the "Hay" is used to aspirate the "k" sound; so this word is spelled with a "Dochashmi Hay", as K, H, A.
In the word "majhool" مجہول even though "h" follows "j", no aspiration takes place since the two letters belong to different syllables ("maj-hool") and are pronounced independently. This word should therefore be typed as M, J, O, W, L, and not M, J, H, W, L which would appear incorrectly as مجھول ! In general, "Dochashmi Hay" should not be used in any Urdu word that is derived from Arabic or Persian, since these languages do not have aspirated letters. Aspirated letters can occur only in the words of Indic origin.
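Because the different kinds of "Hay" can look alike in some fonts and positions, it may help to inspect the underlying characters of a typed word. The following is a minimal sketch, assuming that the keyboard emits the Unicode code points shown (U+062D, U+06BE, U+06C1, U+06C2) and that "kaaf" is U+06A9; the helper name describe is hypothetical.

```python
import unicodedata

# Assumed code points for the letters discussed above.
HAY_FORMS = {
    "BaRi Hay": "\u062D",
    "Dochashmi Hay": "\u06BE",
    "ChoTi Hay": "\u06C1",
    "ChoTi Hay with hamza above": "\u06C2",
}


def describe(word):
    """List the code point and official Unicode name of each character."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in word]


KAHAA = "\u06A9\u06C1\u0627"  # "kahaa" (said): kaaf, ChoTi Hay, alif
KHAA = "\u06A9\u06BE\u0627"   # "khaa" (Eat!): kaaf, Dochashmi Hay, alif

for label, word in (("kahaa", KAHAA), ("khaa", KHAA)):
    print(label)
    for line in describe(word):
        print("   ", line)
```

In the output, the second character is named HEH GOAL for "kahaa" and HEH DOACHASHMEE for "khaa", which makes a mistyped Hay easy to spot.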
There is an exception to the rule that "ChoTi Hay" must be pronounced with an "h" sound. At the end of a word, "ChoTi Hay" is pronounced as an A or E, not as H; for example, the word تكیہ typed as T, K, I, O is pronounced as "takya" (pillow).
An exception to that exception occurs sometimes, and the terminal "Hay" is actually pronounced as H, not A or E. For example, the word "shah" شہ (meaning check [of chess]) is typed X, O. The word "shaah" شاہ (meaning king), typed as X, A, O, is another example where a terminal "ChoTi Hay" is pronounced with an "h" sound.
However, the oddities of Urdu orthography do not end here. In the words ending in a pronounced ChoTi Hay which is not isolated but connected to the previous letter, the Hay is often written twice! For example, the word "kah" (meaning say!) is often written as كہہ , entered by K, O, O; or "sah" (meaning bear!, from the verb "sahna") as سہہ , entered by S, O, O; or "faqeeh" (expert of fiq-h [jurisprudence]) as فقیہہ , entered by F, Q, I, O, O.
The purpose of doubling the ChoTi Hay is ostensibly to avoid its being wrongly pronounced as A or E. For example, without the extra ChoTi Hay the above words "kah" كہہ and "sah" سہہ could be easily confused with the words "ke" كہ (that) and "se" سہ (Persian three), respectively, in which the terminal ChoTi Hay is indeed pronounced as E. But such is clearly not the case with "faqeeh" فقیہہ , where the extra ChoTi Hay actually introduces the hazard of this word being confused with "faqeeha" (female expert of fiq-h). The reason for writing the ChoTi Hay twice in this word seems to be just the whim of the scribe rather than any logical need. In general, you will find that the spelling variation of doubling the ChoTi Hay is practiced unpredictably and rather inconsistently!
The end of an Urdu declarative sentence is marked with a small dash rather than a period. In the UrduPhonetic keyboard, the period key itself generates this dash. Other punctuation symbols such as question mark, exclamation, comma, semicolon, parentheses, brackets, braces, double and single quotation marks, etc., are entered with the usual keys. Punctuation symbols are appropriately reversed or inverted to match the right-to-left flow of text.
The UrduPhonetic keyboard provides two sets of digits 0, 1, 2, .... The plain digit keys generate the so-called Eastern Arabic digit forms, meant to be used in Persian, Urdu, etc. They are the correct digit forms for use with Nastaleeq fonts. The digit keys with option pressed generate the standard Arabic digit forms, traditionally used in Arabic. Unfortunately, in neither case do the Naskh fonts available on the Mac produce the forms preferred in Urdu publications. The standard Arabic digits seem to match the Naskh script a bit better, so in the following example we use digit keys with the option key.
The decimal point sign "٫" and the thousands separator
"٬" are, respectively, typed as option-period and option-comma.
Examples: One million is typed as option-1, option-comma, option-0, option-0, option-0, option-comma, option-0, option-0, option-0, and is displayed as ١٬٠٠٠٬٠٠٠ . The number 3.1416 is typed as option-3, option-period, option-1, option-4, option-1, option-6, and is displayed as ٣٫١٤١٦ .
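When numbers come from a program rather than the keyboard, the same strings can be produced by a simple character translation. This is a sketch under stated assumptions: the option-digit keys are taken to produce U+0660 to U+0669, the plain digit keys U+06F0 to U+06F9, and the two separators U+066B and U+066C. The function name to_urdu_number is hypothetical.

```python
ASCII_DIGITS = "0123456789"

# Assumed code points: digits typed with option, and with the plain keys.
ARABIC_DIGITS = str.maketrans(
    ASCII_DIGITS, "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669"
)
EXTENDED_DIGITS = str.maketrans(
    ASCII_DIGITS, "\u06F0\u06F1\u06F2\u06F3\u06F4\u06F5\u06F6\u06F7\u06F8\u06F9"
)

DECIMAL_SEP = "\u066B"    # typed as option-period
THOUSANDS_SEP = "\u066C"  # typed as option-comma


def to_urdu_number(ascii_number, extended=False):
    """Convert an ASCII number string such as '1,000,000' or '3.1416'."""
    table = EXTENDED_DIGITS if extended else ARABIC_DIGITS
    return (
        ascii_number.translate(table)
        .replace(",", THOUSANDS_SEP)
        .replace(".", DECIMAL_SEP)
    )


print(to_urdu_number(f"{1_000_000:,}"))  # same characters as the one-million example
print(to_urdu_number("3.1416"))          # same characters as the 3.1416 example
```

Passing extended=True selects the digit forms generated by the plain digit keys instead.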
It is traditional in writing dates to insert a "date separator" or "small slash" symbol ؍ (shift-3) between the day number and month word, or between the numbers designating day, month, and year.
Example: August 14, 1947 is typed as option-1, option-4, shift-3, A, G, S, T, space, option-1, option-9, option-4, option-7, shift-4, and is displayed as ١٤؍اگست ١٩٤٧ء . Alternatively, this date can be typed as option-1, option-4, shift-3, option-8, shift-3, option-1, option-9, option-4, option-7, shift-4, and is displayed as ١٤؍٨؍١٩٤٧ء .
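The all-numeric date form above can also be generated programmatically. The sketch below assumes the date separator is U+060D, the trailing hamza is U+0621, and the digits are the option-key forms U+0660 to U+0669; the function name urdu_numeric_date is hypothetical.

```python
import datetime

# Assumed code points for the characters typed in the example above.
DIGITS = str.maketrans(
    "0123456789", "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669"
)
DATE_SEP = "\u060D"   # "small slash", typed as shift-3
YEAR_SIGN = "\u0621"  # independent hamza after the year, typed as shift-4


def urdu_numeric_date(d):
    """Format a date as day, month, year joined by the date separator."""
    parts = (str(d.day), str(d.month), str(d.year))
    return DATE_SEP.join(parts).translate(DIGITS) + YEAR_SIGN


print(urdu_numeric_date(datetime.date(1947, 8, 14)))
```

The string is built in logical (typing) order; the right-to-left display is left to the text renderer.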
Some mathematical symbols are so frequently needed in technical typing that they have become standard in Mac's English keyboards. So the UrduPhonetic keyboard also provides several of these symbols, via option and option-shift keys as usual. Note that the symbols for summation, integration, root, etc., change their orientation to match the right-to-left text direction.
In Urdu mathematical notation, the dots of the dotted letters
are sometimes omitted.
So the UrduPhonetic keyboard provides the dotless forms
ٮ for ب
ڡ for ف
ٯ for ق
Another practice in Urdu mathematical writings is to sometimes use just the stems of letters, not their full form. Such symbols can be easily generated by adding the "kasheeda" character (ـ) to a letter. For example, the symbol خـ , generated by the key sequence shift-K and shift-" , represents the "imaginary" (in Urdu, خیالی) number i.
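For text that is generated rather than typed, the dotless forms and letter stems described above can be obtained by simple string operations. This is a minimal sketch, assuming the dotless letters are U+066E, U+06A1, and U+066F and the kasheeda is U+0640; the function names are hypothetical.

```python
# Assumed code points: dotted letter -> dotless counterpart.
DOTLESS = str.maketrans({
    "\u0628": "\u066E",  # bay -> dotless bay
    "\u0641": "\u06A1",  # fay -> dotless fay
    "\u0642": "\u066F",  # qaaf -> dotless qaaf
})

KASHEEDA = "\u0640"  # elongation character


def dotless(text):
    """Replace bay, fay and qaaf with their dotless forms."""
    return text.translate(DOTLESS)


def stem(letter):
    """Append a kasheeda so that only the stem of the letter is drawn."""
    return letter + KASHEEDA


print(dotless("\u0628 \u0641 \u0642"))
print(stem("\u062E"))  # khay + kasheeda, the symbol used for the number i
```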
NOTE: The keyboard suffices only for the casual typing of a few mathematical symbols in a general document. To prepare documents with elaborate mathematical content, the ideal approach is to use TeX/LaTeX/XeTeX. See the section Typesetting Using TeX, LaTeX, XeTeX.

People accustomed to Nastaleeq word processors will discover that TextEdit uses spaces and other punctuation to separate the words in a document. This is the correct and rational behavior, shared by every non-Nastaleeq word processor in the world. Nastaleeq word processors stand alone in suppressing inter-word spaces. The user, of course, still has to type spaces to signify ends of words, but those spaces are removed and the words follow each other in a continuous stream.
Just imagine reading this English page without spaces between words. Deciphering such a character stream requires, in essence, that you already know what you are trying to learn!
The defenders of the Nastaleeq practice might argue that, unlike the situation with English, the ends of Urdu words are often recognizable. But "often" is not "always". Here is an example: In the sample Urdu text given under Traditional Documents, Editors for Urdu that has been typeset in TextEdit, you can distinguish the words because the beginning and end of each word are clearly displayed. Now take a close look at the first line of the sample: you will find that of the 19 words there, 10 are made up of parts that could themselves be thought of as words. For the same text edited with any popular Nastaleeq word processor, you would be able to skip over those unintended words only because you already know the intended words, not because the text display is of any help!
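The ambiguity created by removing spaces can be made concrete with a toy example in English. The sketch below lists every way a space-free character stream can be cut into known words; the tiny vocabulary is invented purely for illustration, and the function name segmentations is hypothetical.

```python
def segmentations(stream, vocabulary):
    """Return every way to split `stream` into words from `vocabulary`."""
    if not stream:
        return [[]]  # one way to split nothing: the empty split
    results = []
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in vocabulary:
            # Keep this word and split whatever remains.
            for tail in segmentations(stream[end:], vocabulary):
                results.append([head] + tail)
    return results


# Invented toy vocabulary.
VOCAB = {"no", "now", "where", "here", "nowhere"}

for words in segmentations("nowhere", VOCAB):
    print(" ".join(words))
```

With this vocabulary the stream has three readings ("no where", "now here", "nowhere"), and nothing in the stream itself says which one was intended. Typed spaces remove that guesswork.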
When computer typesetting of Nastaleeq was first introduced for Urdu in the 1980s, inter-word spaces were actually used. The practice of suppressing them is more recent. This unwise retrogression, justified in the name of "tradition and esthetics", is an unnecessary obstacle to anyone trying to learn Urdu. The Nastaleeq script already suffers from too many complexities, obscurities, irregularities, and inconsistencies. It makes no sense to invent more barriers to the accessibility of Urdu. The practice simply prolongs the time it takes students to master the language. It is also hindering the development of optical character recognition and other important electronic processing technologies for Urdu.
Exercise for the reader:
Find out what ghatrabood is,
and enjoy the story.
The Urdu alphabet contains the following additional letters that do not exist in Persian:
"Noon Ghunna" ں , "Hamza" ء , "Dochashmi Hay" ھ , and "BaRi Yay" ے   are letters in a rather weak sense, since no Urdu word can begin with these. (Urdu dictionaries do not dedicate chapters to these as they do to regular letters.)
Urdu and Persian differ markedly in the pronunciation of vowels. For example, in Persian the short vowels zer and pesh have majhool sounds and the long vowels vaao and yay have maaroof sounds. (See the note below about maaroof and majhool sounds.) In Urdu, the same vowels do double duty to represent both maaroof and majhool sounds. These differences do not affect writing unless special marks are used to distinguish maaroof and majhool sounds.
There are minor variations in the placement of "hamza" between Urdu and Persian orthographic styles. But the needed forms in all cases are adequately provided by the keyboard and the fonts that we have recommended.
The standard ("educated person's") pronunciation of consonants is generally identical in Urdu and Persian, and often different from Arabic. Some of the similarities and differences are as follows:
Since we have called our keyboard phonetic, we wanted to relate the pronunciation of the alphabet letters with the keys being used to enter them. The tedious details given above will perhaps help you in remembering the keys. As you can see, it is hard to phonetically map the Urdu or Persian alphabet to a Latin-based keyboard!
There is an old classification of certain vowel sounds as maaroof (literally, well-known) or majhool (literally, unknown or unfamiliar). The difference between these can be illustrated with English words as follows:
UrduPhonetic was designed with the aid of Ukelele, a keyboard layout editor for MacOS. I thank John Brownie, the author of Ukelele, for developing this melodious software and for making it available under a freeware license.
I also wish to thank
Amal Ahmed, Aaron Jakes, Muhammad Javed,
Shebab Javed, Karan Misra, Knut S. Vikor, and Muhammad Yusaf
for reporting problems and for offering suggestions to make this
page more informative and useful.
