[Tutorial] Differences between Writer and MS Word files

Postby John_Ha » Mon Jun 06, 2016 4:55 pm

AOO Writer and MS Word (and other word processing programs) are all similar, but none are identical. If you open a Writer .odt file in Word, or a Word .doc, .rtf or .docx file in Writer, you will sometimes notice differences.

Although this tutorial discusses text files it applies to spreadsheet (save as .ods) and presentation (save as .odp) files as well, where the differences between Microsoft and AOO formats are unfortunately even greater. Remember Microsoft has an army of programmers developing the software and adding function to it and neither AOO nor LibreOffice can match that effort. In fact, whenever you use any application, always save in that application's native format to avoid data loss and data corruption.

If you must have 100% compatibility with MS Office then you need to purchase MS Office.
If you can live with 95% plus compatibility then AOO or LO should work for you, especially if you download the free MS Word Viewer from How to obtain the latest Microsoft Word Viewer.

OpenOffice.org Writer for Microsoft Word users: How to perform common tasks is very old but may be of help to those migrating from MS Word to AOO or LO.

Both AOO Writer and MS Word do many of the same things, including text, styles, tables, images, bold, italics, headings, page number, headers, footers etc, as shown in the light blue area below. However, Writer does some things which Word does not do (the red areas); and Word does some things which Writer does not do (the green and dark blue areas). Each program can store its own data in its own file, but obviously cannot store this extra data in the other program's file as there is nowhere for it to go and/or nothing in the other program to see it.

writer and word.png
Different capabilities of Writer and its .odt files, compared with MS Word and its .doc, .docx (and .rtf) files.
Note that MS Word, while capable of supporting some of the function stored in a .odt file, chooses not to implement that function.
MS Word 95 and MS Word 6.0 files cannot store Draw objects.

Writer and Word are based on different schools of typography which can be slightly confusing. Word considers the page header/footer areas to be part of "print matter" while Writer considers them to be "marginalia". You may need to change the top and/or bottom margin widths by the height of 'one line + header/footer spacing' if you have page headers/footers and you are trying to replicate a .doc layout in a .odt file. [Thanks to keme]

Similarly, when you open a .doc, .docx or .rtf file, what you see may not be exactly what the person wrote - formatting in particular is often changed. .rtf files are particularly limited in what they can store. .txt files can store only the text characters - they cannot store any formatting or font information.

0. When using Writer always save all documents as .odt files

When using any application, always save files in that application's format because everything will be saved. When using Writer always save all documents as .odt files.

That way you know that all your document and formatting will be saved. If someone irrationally asks you to send them a .doc file, question the request, and offer to send them a .odt file instead as all versions of Microsoft Office later than 2007 claim to be able both to read and to write .odt files. If MS Word corrupts the .odt file, get the recipient to complain to Microsoft - note the hatched area below where MS Word chooses not to work with that content stored in a .odt file. If the requester insists on a .doc file, then create a .doc file as a copy of the master .odt file, and delete the .doc after sending it so you don't start editing it by mistake in future. If you want to guarantee the recipient sees what you see you have two choices:
1. Send them a .odt file and tell them to open it with AOO Writer. Even this does not guarantee it because if the user does not have the fonts you used installed on their PC, their PC will substitute different fonts.
2. Create a PDF and send them the PDF. This guarantees they will see exactly what you see in your PDF. AOO embeds the fonts in the PDF so even if they don't have those fonts installed, they will use the fonts you embedded. The downside is that they cannot edit a PDF.

Always work in, and save all Writer documents, as .odt files.

Don't forget that Google Docs uses .odt files and Microsoft is now feeling a lot of pressure from the .odt format.

If you save your work in any format other than .odt (eg .doc, .rtf etc) you are almost certain to lose something. In general, it is the more complex things which get lost or mangled, such as Edit > Changes, bullet shapes, colours etc.

 Edit: If you use different Page Styles in a document, and you save the file as a .doc, you will often run into seemingly endless problems with page numbering, headers, footers and page styles. The problems appear to get worse if the .doc file is edited by both AOO and MS Word.

This seems to be because AOO and MS Word handle Page Styles differently. AOO (and possibly LO) seems to have a problem coping with this difference when using Save as .doc. The problem does not occur when you Save as .odt.

If you have a Page Style named Convert 1 (or 2, 3 ... etc) in your .doc file, you probably have the problem - ask for advice on the forum. 

Be very careful with .rtf files

Note how a .rtf file cannot store some of Writer's capability. To make matters worse, Writer has, for example, chosen not to write notes to an .rtf file even though the .rtf file does allow notes to be saved in it. This is an example of an application (Writer) having the capability (notes), but choosing not to provide it in a given format (.rtf). Writer will, of course, save notes in a .odt file so, save as a .odt file and then create a copy as a .rtf file. If anything gets lost in the .rtf file you can go back to the .odt file where it will be saved.

OpenOffice Migration Guide

See the OpenOffice Migration Guide for more information.

1. Textboxes in .docx files do not display and nor does their content

Later versions of MS Word which write .docx files often use Textboxes. Textboxes are not part of the OOXML International Standard - they are a Microsoft add-on which is proprietary. See OOXML/Markup Compatibility and Extensibility which says

Although the OOXML spec defines a specific set of allowed elements, Microsoft sometimes extend this with additional proprietary elements that are specific to new versions of Office. For example, if you insert a shape into a document in Word 2013, it will be defined in terms of a "word processing shape" element structure, which is not part of the OOXML spec. For the purposes of compatibility with older versions of Word however, they include a second version of the shape which uses an element structure that is defined in the spec, albeit using the legacy VML drawing format.

AOO Writer (4.1.3) only recognises the OOXML Standard parts of the file - anything which does not comply with the OOXML Standard is ignored so Textboxes are ignored. It appears that LibreOffice Writer does recognise Textboxes.
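
If you want to check whether a .docx file is likely to be affected, you can look inside it, because a .docx file is just a zip archive of XML files. The sketch below is only an illustration: the function name is made up, and it assumes that Word writes the proprietary text box as a wps:txbx element wrapped in mc:AlternateContent, with the legacy VML version as v:textbox. It does a plain text search and does not parse the XML properly.

```python
import zipfile

# Element names this sketch searches for (assumptions, see the text above).
MARKERS = {
    "wps:txbx": "word processing shape text box (Microsoft extension)",
    "v:textbox": "legacy VML text box (the fallback copy)",
    "mc:AlternateContent": "markup compatibility wrapper",
}


def find_textbox_markers(docx):
    """Return {marker: count} for each marker found in word/document.xml.

    `docx` is a file name or a file-like object holding a .docx file.
    """
    with zipfile.ZipFile(docx) as zf:
        xml = zf.read("word/document.xml").decode("utf-8", errors="replace")
    found = {}
    for marker in MARKERS:
        # Count opening tags only; closing tags start with "</".
        count = xml.count("<" + marker)
        if count:
            found[marker] = count
    return found
```

An empty result suggests the document contains no Textboxes. A non-empty result means it is worth opening the file in the Word Viewer to check nothing is missing.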

2. Bullets, list items and numbered items in .doc files often display incorrectly

Bullets, list items and numbered items in MS Word .doc files often display incorrectly when the file is opened with Writer and the corruption persists when the file is saved as a .odt file. Typical corruptions are the bullet appearing with a digit inside (10 is common), or the list number [eg a) or b) ] being struck through or highlighted in colour.

The bullet appearing with a digit inside (10 is common) is almost always a font substitution problem, not an MS Word problem, arising when the OpenSymbol font cannot be found. See Bullets were working fine, why did someone mess that?

The incorrect list item problem is usually caused by MS Word specific Character Styles, typically with names like WW8Num1z0, WW8Numz2 ..., etc, which are applied to Bullets, Lists and Numbering. Deleting these MS Word Character Styles (or editing them to be consistent with what is available in Writer) fixes the problem. What actually happens is that the MS Word Character Style, which is defined in the Styles and Formatting dialogue under Character Styles, is applied to the Bullets, Lists and Numbered Items by the Format > Bullets and Numbering ... dialogue, where it appears under the Options tab as the selected Character Style. Set it to Numbering Symbols, which is the default setting for AOO bullets. If you set it to None, the bullets pick up the font etc characteristics from the text and not from the List Styles. See Oddities Involving Bullets/Outlines & Font Styles/Highlights

Either delete these unwanted Character Styles by

1 press F11 to open the Styles and Formatting window
2 click Character Styles - second icon
3 right click the character styles with names beginning WW8 > delete

This fixes it throughout the entire document.

Or, fix just one occurrence by resetting it to use the Writer defaults

1 place the cursor in a bulleted line and go Format > Bullets and Numbering
2 choose the Options tab
3 Character Style will be something like WW8Num1z0. Set it to Numbering Symbols (or None as appropriate)
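
If you want to see whether a document carries these styles before opening the Styles and Formatting window, you can list them from the .odt file itself. This is a rough sketch, not an official tool: the function name is made up, and it assumes the usual ODF layout in which character styles are style:style elements with style:family="text" in styles.xml or content.xml. It only reports the styles, it does not delete them.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF "style" namespace, used for both the element and its attributes.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def find_ww8_character_styles(odt):
    """Return the sorted names of character styles beginning with 'WW8'.

    `odt` is a file name or a file-like object holding a .odt file.
    """
    names = set()
    with zipfile.ZipFile(odt) as zf:
        parts = zf.namelist()
        for part in ("styles.xml", "content.xml"):
            if part not in parts:
                continue
            root = ET.fromstring(zf.read(part))
            for style in root.iter("{%s}style" % STYLE_NS):
                name = style.get("{%s}name" % STYLE_NS, "")
                family = style.get("{%s}family" % STYLE_NS, "")
                # Character styles have the family "text" in ODF.
                if family == "text" and name.startswith("WW8"):
                    names.add(name)
    return sorted(names)
```

If the list is not empty, the steps above (F11 > Character Styles > delete) are the way to remove the styles.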

3. Documents lay out differently - lines, paragraphs and pages spill in different places

This is not an MS Word / OpenOffice problem - it is more a "Microsoft Windows lockin" problem.

It is in Microsoft's commercial interest to keep on changing fonts and/or add new fonts to Windows and to encourage Windows users to use these new fonts. When documents with these new fonts are sent to users using other operating systems, or even older versions of Windows, which do not have the fonts installed, the documents will invariably change format - lines, paragraphs and pages spill in different places.

The only way to ensure the layout does not change is to do what PDFs do, namely embed the fonts in the PDF file itself. While AOO embeds fonts in PDF files it creates, it does not embed fonts in .odt or .doc etc files. Hence you need to install the fonts on the new PC if the document is to appear identical.

Remember that the font showing in the Writer font drop-down selection box is the font the document is asking for. This may NOT be the font being used to create the display because, if the font being asked for is not installed on the PC, Windows (or other operating system) will silently substitute a different font which is available.

The TestFonts add-on is invaluable for finding missing fonts which the document is asking for, but which are not installed on the PC.

You can check which font is being used to display any given text by highlighting that text and going Format > Character > Font ... If the font is missing the text will say "This font has not been installed. The closest available font will be used."

You can see which fonts are installed on the PC by Start > Control Panel > Fonts ..., or by opening C:\Windows\Fonts. Macs seem to have multiple locations for font files.
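
As a rough alternative to checking each piece of text by hand, you can list the fonts a .odt file asks for and compare them with a list of fonts you know are installed. The sketch below is only an illustration: the function names are made up, it assumes the fonts are declared as style:font-face elements (the normal ODF layout), and the list of installed fonts is something you supply yourself, because the sketch does not query the operating system.

```python
import zipfile
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
SVG_NS = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"


def requested_fonts(odt):
    """Return the sorted font family names declared in a .odt file."""
    fonts = set()
    with zipfile.ZipFile(odt) as zf:
        parts = zf.namelist()
        for part in ("content.xml", "styles.xml"):
            if part not in parts:
                continue
            root = ET.fromstring(zf.read(part))
            for face in root.iter("{%s}font-face" % STYLE_NS):
                # Prefer the family name; fall back to the declaration name.
                family = face.get("{%s}font-family" % SVG_NS) or face.get(
                    "{%s}name" % STYLE_NS, ""
                )
                # Family names containing spaces are usually quoted.
                family = family.strip("'\" ")
                if family:
                    fonts.add(family)
    return sorted(fonts)


def missing_fonts(requested, installed):
    """Return the requested fonts not in `installed` (case-insensitive)."""
    have = {name.lower() for name in installed}
    return [font for font in requested if font.lower() not in have]
```

For example, missing_fonts(requested_fonts("report.odt"), ["Arial", "Times New Roman"]) lists the fonts that would be substituted on a PC which has only those two fonts. The file name and font list here are made up.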

4. Saving as .doc files is not recommended but ...

... if you are forced to create a .doc file, save as a .odt as usual, and create a copy as a .doc file. Be sure to select Word 97 / 2000 / XP as it is the most recent format. Word 95 and Word 6.0 .doc formats are very old and obsolete and less comprehensive than Word 97 / 2000 / XP .doc format. For example, Word 95 and Word 6.0 file format cannot store Draw objects.

Types of .doc files.png
Use Word 97 / 2000 / XP - Word 95 and Word 6.0 are very old and obsolete

If you attempt to save a document as any format other than .odt, Writer warns you that you may lose data and / or formatting as in the pop-up window below. Unfortunately, many users switch off this warning. If you do not get this warning message, you can switch it back on with Tools > Options > Load/Save > General ...

Save As doc file.png
Warning message given when you save as anything which is NOT .odt.

DO NOT SWITCH THIS WARNING OFF!

5. Microsoft Word Viewer

If you regularly receive .doc or .docx files, you will find it very useful to download the free Microsoft Word Viewer from How to obtain the latest Microsoft Word Viewer. You can then open the .doc or .docx file, and check to see if any content is missing and, if necessary, copy the content into Writer.

6. MS Word can read and write .odt files

All versions of MS Word later than Word 2007 claim to be able both to read and write .odt files and Microsoft lists its partial support of .odt files in Differences between the OpenDocument Text (.odt) format and the Word (.docx) format. So, if someone sends you a .doc or .docx file you cannot read, ask them to send you a .odt file instead. If MS Word does not create a proper .odt file, ask the sender to complain vigorously to Microsoft. Similarly, if you send someone who uses MS Word a .odt file, and MS Word does not present it correctly, ask the person who received it to complain vigorously to Microsoft.

Note that AOO has some Microsoft compatibility options available under Tools > Options > Load/Save > VBA Properties..., and Tools > Options > Load/Save > Microsoft Office ..., which may need changing.

7. Academic study of Interoperability Issues

For an academic study of the problems see the University of Illinois' paper Lost in Translation: Interoperability Issues for Open Standards written in 2008.

I did not think that the paper covered very well the fact that the key benefit of an Open Standard is that ...

... it provides all the information necessary so that anyone can extract all the information from the data file without needing to have the application. This is because the file structure is not a commercial secret.
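
To illustrate the point, here is a minimal sketch which pulls the paragraph text out of a .odt file using nothing but a standard zip and XML library, with no word processor installed. The function name is made up and the sketch is deliberately simplified: it assumes the text lives in text:p and text:h elements in content.xml, it ignores formatting, and it drops the special ODF elements used for tabs and repeated spaces.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF "text" namespace, which holds paragraphs (p) and headings (h).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(odt):
    """Return the plain text of each paragraph and heading in a .odt file.

    `odt` is a file name or a file-like object. Text nested inside frames
    may be reported more than once; this is a sketch, not a converter.
    """
    with zipfile.ZipFile(odt) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    wanted = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() joins the text of the element and its children.
            paragraphs.append("".join(element.itertext()))
    return paragraphs
```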

Similarly, I felt the paper only briefly mentioned that applications must support all the "items" coded in the file - see the diagram on this page. Interoperability only exists across those functions implemented in both programs and those functions which are implemented in the file format being used to store the document, ie the light blue items for Writer, MS Word, .odt and .doc files.

Further information on the history of the .doc format can be found in the wiki article Doc (computing) which includes:

Specification

Because the DOC file format was a closed specification for many years, inconsistent handling of the format persists and may cause some loss of formatting information when handling the same file with multiple word processing programs. Some specifications for Microsoft Office 97 binary file formats were published in 1997 under a restrictive license, but these specifications were removed from online download in 1999. Specifications of later versions of Microsoft Office binary file formats were not publicly available.

The DOC format specification was available from Microsoft on request since 2006 under restrictive RAND-Z terms until February 2008. Sun Microsystems and OpenOffice.org reverse engineered the file format. On February 15, 2008, Microsoft released a .DOC format specification under the Microsoft Open Specification Promise. However, this specification does not describe all of the features used by DOC format and reverse engineered work remains necessary.

Since 2008 the specification has been updated several times; the last change was made in September 2015.


8. Microsoft’s OOXML "pseudo-standard" format (.docx etc)

See Why you should never use Microsoft’s OOXML pseudo-standard format where Italo Vignoli of The Document Foundation, the organization responsible for developing LibreOffice, talks about "the dirty tricks Microsoft uses to break interoperability and keep users locked into their platform". It includes
... each version of MS Office since 2007 has a different and non standard implementation of OOXML, which is defined as “transitional” because it contains elements which are supposed to be deprecated at standard level, but are still there for compatibility reasons. Although LibreOffice manages to read and write OOXML in a fairly appropriate way, it will be impossible to achieve a perfect interoperability because of these different non standard versions.

In addition to format incompatibilities, Microsoft – with OOXML – has introduced elements which may lead the user into producing a non interoperable document, such as the C-Fonts (for instance, Calibri and Cambria).

See MS Office 2007 OOXML file format (docx, xslx, pptx, ppsx) for a discussion of OOXML and why many consider OOXML is a deliberate attempt by Microsoft to make it almost impossible for other vendors to read or write fully compliant OOXML files. The "standard" is 6,000 pages long and it is estimated a full import or export filter would take 50 to 500 person-years to write.

And after you have done all that work, all it takes is for Microsoft to make another not-part-of-the-standard change or addition to the so called "standard" ... and your filter no longer works.

9. By default .docx files do not comply with the OOXML standard

See Complex singularity versus openness for a discussion of the impossible position in which vendors find themselves because Microsoft default .docx files do not comply with the OOXML standard. What hope is there if Microsoft doesn't even bother to use the standard it professes to use?

A default installation of MS Word uses the "transitional" OOXML "standard" which does not comply. It is possible for users to configure MS Word to use the Strict OOXML Standard, which is fully compliant, but very, very, very few do, and even fewer have even heard of it! You might even conclude that it is in Microsoft's commercial interest - it's all about money - for users to use the "transitional" "standard" because it makes exchange between MS Word and other vendors more complex, and users might be forced into buying MS Word.
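
You can get a rough idea of which flavour a given .docx file uses by looking at the XML namespace of the main document part. This is only a sketch: the function name is made up, and it assumes that Transitional files use the schemas.openxmlformats.org wordprocessingml namespace while Strict files use the purl.oclc.org one. It says nothing about whether the file also contains proprietary extensions.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces of the root element of word/document.xml (assumptions, see above).
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STRICT_NS = "http://purl.oclc.org/ooxml/wordprocessingml/main"


def ooxml_flavour(docx):
    """Return 'transitional', 'strict' or 'unknown' for a .docx file.

    `docx` is a file name or a file-like object holding a .docx file.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    # ElementTree writes tags as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    if namespace == STRICT_NS:
        return "strict"
    if namespace == TRANSITIONAL_NS:
        return "transitional"
    return "unknown"
```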

10. Exchanging documents for proof reading between AOO and MS Word

It is always best for both people to use the identical application if a document is to be edited by both. However that is often not possible and if one uses AOO and the other uses MS Word, there will probably be problems.

First, whatever you decide, if you are using OpenOffice always save your work as a .odt file and consider it to be the master document.

a) If the document is reasonably simple with little complexity or formatting, then send the .odt file and ask the other person to open the .odt file with MS Word, and to save it as a .odt file.

The success depends on how good MS Word is when working with .odt files - see Use Word to open or save a document in the OpenDocument Text (.odt) format. I cannot improve on Microsoft's excellent tip:
When you collaborate on a document shared between Word and another word processing application, such as Google Docs or OpenOffice.org Writer, think of writing (the words) and formatting (the look) as different tasks. Complete as much of the writing as possible without applying formatting to the text and save the formatting until the end. This allows you to focus on the writing while minimizing the loss of formatting as you switch between the OpenDocument Text format and Word format.

We have seen a number of forum posts where Edit > Record changes ..., has been used, and the file is saved (by LibreOffice? by MS Word?) as a .docx file. The .docx file gets badly corrupted. Also, the person using MS Word should not highlight a range of characters and attach a comment to them as this is known to cause file corruption. Comments attached to a location are fine.

Also see Differences between the OpenDocument Text (.odt) format and the Word (.docx) format which lists what MS Word supports, partially supports and does not support in .odt/.docx files.

b) The next safest method is probably to create a copy of your master .odt file as a .doc file and send the .doc file to the other person.

c) The only 100% certain method is for both you and the other person to use the identical software.

11. AOO Help has a section About Converting Microsoft Office Documents ...

... which discusses some of the differences.
About Converting Microsoft Office Documents

OpenOffice can automatically open Microsoft Office 97/2000/XP .doc document files. However, some layout features and formatting attributes in more complex Microsoft Office documents are handled differently in OpenOffice or are unsupported. As a result, converted files require some degree of manual reformatting. The amount of reformatting that can be expected is proportional to the complexity of the structure and formatting of the source document. OpenOffice cannot run Visual Basic Scripts, but can load them for you to analyse.

The most recent versions of OpenOffice can load, but not save, the Microsoft Office Open XML document formats with the extensions .docx, .xlsx, and .pptx. The same versions can also run some Microsoft Excel Visual Basic scripts, if you enable this feature at Tools - Options - Load/Save - VBA Properties.

The following lists provide a general overview of Microsoft Office features that may cause conversion challenges. These will not affect your ability to use or work with the content of the document once the MS file has been saved as a .odt etc file.

Microsoft Word
1. AutoShapes
2. Revision marks
3. OLE objects
4. Certain controls and Microsoft Office form fields
5. Indexes
6. Tables, frames and multi-column formatting
7. Hyperlinks and bookmarks
8. Microsoft WordArt graphics
9. Animated characters/text

Microsoft PowerPoint
1. AutoShapes
2. Tab, line and paragraph spacing
3. Master background graphics
4. Grouped objects
5. Certain multimedia effects

Microsoft Excel
1. AutoShapes
2. OLE objects
3. Certain controls and Microsoft Office form fields
4. Pivot tables
5. New chart types
6. Conditional formatting
7. Some functions/formulae (see below)

One example of differences between Calc and Microsoft Excel is the handling of boolean values. Enter TRUE to cells A1 and A2.
In Calc, the formula =A1+A2 returns the value 2, and the formula =SUM(A1;A2) returns 2.
In Excel, the formula =A1+A2 returns 2, but the formula =SUM(A1,A2) returns 0.
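
The difference described above can be modelled in a few lines of Python. This is only a toy model of the behaviour stated in the Help text, not how either program is implemented, and the function names are made up. It assumes the cells hold only numbers or boolean values.

```python
def plus(a, b):
    """Model of =A1+A2: the + operator treats TRUE as 1 in both programs."""
    return int(a) + int(b)


def calc_sum(*cells):
    """Model of Calc's SUM over cell references: TRUE counts as 1."""
    return sum(int(cell) if isinstance(cell, bool) else cell for cell in cells)


def excel_sum(*cells):
    """Model of Excel's SUM over cell references: boolean cells are skipped."""
    return sum(cell for cell in cells if not isinstance(cell, bool))


a1, a2 = True, True
print(plus(a1, a2))       # 2, as in both Calc and Excel
print(calc_sum(a1, a2))   # 2, as in Calc
print(excel_sum(a1, a2))  # 0, as in Excel
```

The practical consequence is that a spreadsheet which sums a column of TRUE/FALSE cells can give different totals in the two programs, so check such formulae after converting.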

For a detailed overview about converting documents to and from Microsoft Office format, see the OpenOffice Migration Guide.

Opening Microsoft Office Documents That Are Protected With a Password

OpenOffice can open the following Microsoft Office document types that are protected by a password.

Note: If you cannot open an encrypted file, ask someone with MS Word to open it for you, and save it without the password.

Microsoft Office format                                 Supported encryption method

Word 6.0, Word 95                                       Weak XOR encryption

Word 97, Word 2000, Word XP, Word 2003                  Office 97/2000 compatible encryption

Word XP, Word 2003                                      Weak XOR encryption from older Word versions

Excel 2.1, Excel 3.0, Excel 4.0, Excel 5.0, Excel 95    Weak XOR encryption

Excel 97, Excel 2000, Excel XP, Excel 2003              Office 97/2000 compatible encryption

Excel XP, Excel 2003                                    Weak XOR encryption from older Excel versions


Starting from OpenOffice.org 3.2 or StarOffice 9.2, Microsoft Office files that are encrypted by AES128 can be opened. Other encryption methods are not supported.


Disclaimer: Everything in this post is opinion. Please let me know of any errors so they can be corrected.
Last edited by John_Ha on Fri Jul 13, 2018 1:15 pm, edited 3 times in total.
AOO 4.1.5, Windows 7 Home 64 bit

See the Writer Manual, the Writer FAQ, the Writer Tutorials and the Writer guide.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.

Re: [Tutorial] Differences between Writer and MS Word files

Postby tampamamba » Tue Jul 10, 2018 1:24 pm

As a new user of the forum, and a ten+ year user of AOO (before it was Apache) I would like to thank you for the attention to detail and overall professionalism of your tutorial!
I was looking for a hint on whether or not there is a compatible open-source voice to text program available because my typing abilities have eroded with my "maturity."
Apache OpenOffice 4.15
Windows 10
Patrick McLaughlin