Word 2004 issues

[May 22, 2004]

I have found some unexpected and rather intolerable bugs (or features?) related to the implementation of Unicode in Word 2004 (Test Drive) on MacOS 10.3:

  1. Buggy support for keyboard layouts.
  2. Improper input methods for "Unicode characters". The consequences are:
  3. Incompatibility with Word for Windows.
  4. Incompatibility with other Unicode-savvy Mac applications: copy and paste.
  5. Incompatibility with other Unicode-savvy Mac applications: save and load.

1) Buggy support for keyboard layouts.

This "feature" relates to the use of Option key combinations.

In MacOS X, it is generally possible to let a Roman keyboard layout (group 0) produce non-MacRoman output, as if it were a Unicode keyboard layout (group 126), and such a layout can of course be used in Unicode-savvy applications such as TextEdit, Mellel and Nisus Writer Express. A Roman keyboard layout can also be used in applications that are not Unicode-savvy, where the additional characters will simply not be displayed. This is in accordance with TN2056.

One of the advantages of using such a "group 0" version of a Unicode-capable keyboard layout is that you can use it as your default keyboard layout, and you don't have to reselect it every single time you launch a Unicode-savvy application where you want to type non-MacRoman characters.

Now, Word 2004 is Unicode-savvy, and it does of course support both Unicode keyboard layouts (group 126) and Roman keyboard layouts (group 0) with extended capabilities. So in my favourite extended Roman keyboard layout I can press a dead key (e.g. acute accent) + a letter (e.g. p), and get "p with acute accent".

However, if the keyboard layout is identified as Roman (group 0), it is not possible in Word 2004 (as opposed to e.g. TextEdit, Mellel and Nisus Writer Express) to type non-MacRoman characters that have been placed on an Option key combination. This means that the letter "edh" (placed on Option+d in my layout) will not appear. On the other hand, Option+d still produces "edh" if the same keyboard layout is identified as a Unicode keyboard layout (group 126), as it is supposed to.

2) Improper input methods for "Unicode characters".

If we want to write e.g. Greek characters in TextEdit or any other Unicode-savvy application for MacOS X, we have to choose the Greek keyboard layout to access the Greek Unicode block. This differs from the method used in Windows, where we change the font (typically we would use the Symbol font), and continue to use a non-Greek keyboard layout (e.g. Norwegian or US).

Word 2004 allows us to write Greek letters using the Symbol font together with a Roman (non-Unicode) keyboard layout. (It is also possible to mark a piece of text written with the Times font, change the font to Symbol, and get Greek letters.) For example, we can choose the Symbol font and press the "a" key using the US keyboard layout. On the screen we will then see a Greek alpha.

This should not really be possible. When we have chosen a Roman _or_ Unicode keyboard layout for Roman letters (e.g. the Norwegian, US or US Extended keyboard layouts), and press a key that is supposed to output Unicode character #x0061 (i.e. the letter "a"), then we should never expect to get a Greek alpha. We should expect to get exactly character #x0061 (the letter "a"), whether we use a Roman _or_ a Unicode keyboard layout. If we want to write a Greek alpha, then we should select a keyboard layout that defines the output of e.g. the "a" key to be #x03B1, i.e. "alpha". Such a keyboard layout is the already mentioned Greek Unicode keyboard layout.
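The distinction between the two characters can be checked outside any word processor. The following is a minimal Python sketch (the helper name `describe` is made up for this illustration) that prints the code point and official Unicode name of each character. It only shows that the two are different characters regardless of font; it says nothing about how Word stores them internally.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# What a Roman or Unicode Latin layout should emit for the "a" key.
print(describe("a"))        # U+0061 LATIN SMALL LETTER A

# What a Greek Unicode layout emits for the same physical key.
print(describe("\u03b1"))   # U+03B1 GREEK SMALL LETTER ALPHA
```

Changing the font of a piece of text changes only the glyphs used to draw it; the code points reported here stay the same.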

So, Word 2004 allows us to write Greek letters with Roman keyboard layouts (such as US) in combination with the Symbol font. But if we use a Unicode keyboard layout (such as US Extended), the "a" key will produce a Roman "a", not a Greek alpha. The two kinds of keyboard layouts should behave in the same way, they should not allow us to type Greek letters (unless Greek letters are included in the keyboard layouts), and it should not be possible to convert Roman letters into Greek letters simply by changing the font.
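To illustrate what "Greek by font change" means in terms of characters, here is a small Python sketch that maps Roman letters to the Greek letters the Symbol font conventionally draws in their place (for instance q becomes theta, as in the Q/q example in section 4). The function name `symbol_to_greek` and the choice to cover only the 24 basic letters are assumptions made for this illustration; it is not a description of what Word 2004 does internally, and it ignores the font's variant glyphs and punctuation.

```python
# Roman letters and the Greek letters that the Symbol font shows for them.
# Only the 24 basic letters are covered; variant glyphs are left out.
_ROMAN = "abgdezhqiklmnxoprstufcyw"
_GREEK = "αβγδεζηθικλμνξοπρστυφχψω"

_TABLE = str.maketrans(_ROMAN + _ROMAN.upper(), _GREEK + _GREEK.upper())

def symbol_to_greek(text: str) -> str:
    """Replace Roman letters with the Greek letters Symbol would display."""
    return text.translate(_TABLE)

print(symbol_to_greek("a"))   # α
print(symbol_to_greek("Qq"))  # Θθ
```

A document typed the irregular way contains the characters on the left; a document typed with a Greek Unicode keyboard layout contains the characters on the right, whatever font is applied.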

These deviations from standard input methods cause incompatibilities both with Word for Windows and with other Mac applications. This is discussed below.

3) Incompatibility with Word for Windows.

Documents that have been written with the "wrong" input method mentioned above (e.g. "alpha" written with a Roman keyboard layout and the Symbol font), are incompatible with Word for Windows (tested Word 2000). In a test document I wrote Greek letters using three different methods:

a) Greek Unicode keyboard layout + Symbol font
b) Greek Unicode keyboard layout + Times New Roman
c) Norwegian Roman keyboard layout + Symbol font

Methods a and b are in agreement with how keyboards in MacOS X are expected to work. Method c is allowed by Microsoft Word 2004, but is, as described above, highly irregular.

The Compatibility Report of Word 2004 claimed that the ".doc" document would be compatible with Word 2000 for Windows. However, Word 2000 for Windows did not agree, and refused to even open the test document.

When I deleted the text written with method c, the problem disappeared, and Word 2000 for Windows opened the document perfectly.

4) Incompatibility with other Unicode-savvy Mac applications: copy and paste.

I made a new test document, where I tried the following methods for writing Greek letters:

a) Greek Unicode keyboard layout + Symbol font
b) Greek Unicode keyboard layout + Times New Roman
c) Norwegian Roman keyboard layout + Symbol font
d) Norwegian Unicode keyboard layout + Symbol font
e) Norwegian Roman keyboard layout + Times New Roman, subsequently converted to the Symbol font
f) Norwegian Unicode keyboard layout + Times New Roman, subsequently converted to the Symbol font

Text d refused to be written with Greek letters, as expected. This text is therefore irrelevant for the rest of the discussion.

I marked texts a-f, copied them, and pasted them into TextEdit.

Text a was unexpectedly rendered as Roman (not Greek) letters in the "SILDoulos IPA 93" font (a phonetic font that I installed at some time), and not with the Symbol font, even though text a had been typed in Word 2004 with the appropriate and correct method.

Text b had also been typed with the correct method, and this text did indeed show up as expected: Greek letters in Times New Roman.

Text c was rendered just like text a, except that a few letters were different: Where I had used the Q/q-key (yielding : and ; in the Greek original of text a, and uppercase/lowercase theta in the Greek original of text c), this was rendered as superscript 1 and small uppercase L in the copy of text a, and æ/q in text c.

Texts e and f were treated alike, and both survived identical to the originals.

Summary of section 4: Input methods b (proper method), e and f (improper methods) worked great, and the text survived the copying and pasting. Input methods a (proper method) and c (improper method) did not. Or in other words: What is supposed to work, does not always work. What is not supposed to work, may work.

5) Incompatibility with other Unicode-savvy Mac applications: save and load.

This is getting fun.

I saved this last test document as .doc and as .rtf, and then opened the two files in TextEdit.

When the rtf-version was opened, it looked exactly like the copy-and-paste version discussed above.

The doc-version did not:

Texts a, b (proper input methods) and c (improper method) survived unchanged.

Texts e and f (improper methods) showed up as Times New Roman with non-Greek letters.

Summary of section 5: Sometimes things work, other times they don't, and it is difficult to see any motivated pattern.
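The outcomes reported in sections 4 and 5 can be tabulated to see which input methods survived every route. The Python sketch below uses only the results stated above (text d is left out because it never produced Greek letters); the variable names are made up for this illustration.

```python
# Texts that arrived intact in TextEdit, per transfer route,
# as reported in sections 4 and 5.
survived = {
    "copy_paste": {"b", "e", "f"},
    "rtf": {"b", "e", "f"},   # looked exactly like the copy-and-paste result
    "doc": {"a", "b", "c"},
}

# Methods that survived all three routes.
robust = set.intersection(*survived.values())
print(sorted(robust))  # ['b']
```

In these tests, only method b (Greek Unicode keyboard layout + Times New Roman) came through all three routes unchanged.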

All in all:

Word 2004 for MacOS X is to a certain degree in conflict with MacOS X, and it is to a certain degree incompatible with Word versions for Windows with which it is claimed to be compatible. I would rather wait for Nisus Writer Express 2.0 than buy Office 2004.

I have not tested the other Office 2004 applications in such detail, but I have noted that Excel seems to have the same problems regarding the Option key.


Jardar Eggesbø Abrahamsen
<jardar at nvg.ntnu.no>


PS