[Solved] LibreOffice .docx file missing text but not images

Help with installation and general system troubleshooting questions concerning the office suite LibreOffice.
HighOnTrombone
Posts: 1
Joined: Mon Apr 20, 2020 10:19 pm

[Solved] LibreOffice .docx file missing text but not images

Post by HighOnTrombone »

Hi all,

I'm working on a document using LibreOffice 6.3.5.2 on Windows 10. I saved it as a .docx (my attempts to find a solution so far have taught me the error of doing that; I'll use .odt in the future), but when I reopened it the next day, most of the text was missing, starting at about the list of tables.

Googling to find a solution has led me to a few things that sound similar, but aren't quite the same. This doesn't seem to be a problem with SAXParse. This bug thread isn't about quite the same problem; instead of the file just being truncated, the images I put in are still there (as are the cross-references throughout the paper and the hyperlinks in what used to be the bibliography), and while I did try checking the document.xml file, XML Copy Editor says it's well-formed and the XML Tools plugin for Notepad++ detects no errors. Trying to open it in MS Word 2010 didn't work — it just told me it couldn't open the file because of an unspecified error in word\document.xml at line 2, column 0.

The information all seems to still be in document.xml, so it should theoretically be fixable; I'm just not sure how. I've uploaded the file here so people can take a look if they want. Thanks in advance for any help you can give me.
Last edited by HighOnTrombone on Tue May 05, 2020 4:48 pm, edited 1 time in total.
LibreOffice 6.3.5.2 on Windows 10
Zizi64
Volunteer
Posts: 11352
Joined: Wed May 26, 2010 7:55 am
Location: Budapest, Hungary

Re: LibreOffice .docx file missing text but not images

Post by Zizi64 »

I saved it as a .docx
Always work in the native, international standard ODF file formats. Save a copy in a foreign file format at the end of editing, if necessary.
Tibor Kovacs, Hungary; LO7.5.8 /Win7-10 x64Prof.
PortableApps/winPenPack: LO3.3.0-7.6.2;AOO4.1.14
Please, edit the initial post in the topic: add the word [Solved] at the beginning of the subject line - if your problem has been solved.
John_Ha
Volunteer
Posts: 9583
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: LibreOffice .docx file missing text but not images

Post by John_Ha »

If you unzip the .docx file (rename fred.docx to fred.zip and double-click it), you will see word\document.xml, which contains the text and XML tags. The word\media folder contains the images.
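
If you would rather not rename the file, the same inspection can be sketched in Python with the standard zipfile module. This is only an illustration: the function names are my own, the path fred.docx is the placeholder from above, and it assumes document.xml is UTF-8 encoded, which may not hold for every .docx.

```python
import zipfile


def list_docx_parts(docx_path):
    """Return the names of all parts stored inside a .docx (ZIP) package."""
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()


def read_document_xml(docx_path):
    """Return the contents of word/document.xml as a string.

    Assumption: the part is UTF-8 encoded.
    """
    with zipfile.ZipFile(docx_path) as package:
        return package.read("word/document.xml").decode("utf-8")
```

Calling list_docx_parts("fred.docx") should show word/document.xml plus one entry under word/media/ for each embedded image.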

Open document.xml with Notepad++.

If you pretty-print document.xml with the XML Tools plugin, it becomes readable.

(Optional: Linearise the XML first, or you will get loads of tabs in the result.) Go to Search > Replace ..., enter <[^>]+> as the search argument, and leave the replace argument blank. Be sure to tick Regular Expressions. Click Replace All. This strips the tags. See text.odt, which has the text, but all formatting, tables etc. have been stripped.
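
The same tag-stripping step can be sketched in Python using the identical pattern <[^>]+>. Treat it as a rough illustration rather than a proper recovery tool. The function name is my own, and two things go beyond the Notepad++ steps: each paragraph end (</w:p>) is turned into a line break so paragraphs do not run together, and XML entities such as &amp;amp; are decoded afterwards.

```python
import html
import re

# Same pattern as the Notepad++ search argument: anything between < and >.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(xml_text):
    """Return the plain text of a document.xml string with all tags removed."""
    # Addition to the manual steps: end each paragraph with a newline.
    text = xml_text.replace("</w:p>", "\n")
    # Remove every remaining tag, including the XML declaration.
    text = TAG_PATTERN.sub("", text)
    # Decode entities such as &amp; only after the tags are gone.
    return html.unescape(text)
```

As with the manual method, all formatting and table structure is lost; only the raw text survives.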

See [Tutorial] Differences between Writer and MS Word files for a description of the differences and for why you should always work in and save Writer files as .odt, Calc files as .ods, Impress files as .odp, etc.

Showing that a problem has been solved helps others searching so, if your problem is now solved, please view your first post in this thread and click the Edit button (top right in the post) and add [Solved] in front of the subject.
Attachments
text plus tabs.odt
Text complete with tabs - may be easier to understand
(44.71 KiB) Downloaded 294 times
text.odt
Text with no tabs
(40.58 KiB) Downloaded 280 times
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.