Missing large part of text due to error SAX

Help with installation and general system troubleshooting questions concerning the office suite LibreOffice.
konstantinos
Posts: 1
Joined: Mon Jul 02, 2018 12:20 pm

Missing large part of text due to error SAX

Post by konstantinos »

Hi all, I have this problem when trying to open my file. Unfortunately, I realized too late that I should always save my work as .odt. The damage is done now, so I wonder if anyone can help me fix my document, as it is for academic publication. Lesson learned...
LibreOffice version: 6.0.3.2. macOS High Sierra version: 10.13.5
FJCC
Moderator
Posts: 9248
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Missing large part of text due to error SAX

Post by FJCC »

Repairing the document requires editing the XML content. There is a tutorial about this here. You can also post the document here, share it via Dropbox/Google Drive/MediaFire, or email it to someone. I can send you my email address via private message if you are interested in that.
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
John_Ha
Volunteer
Posts: 9583
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Missing large part of text due to error SAX

Post by John_Ha »

Your file is almost certainly repairable and all content will be recovered. For an explanation of SAXParse errors and how to fix them see [Tutorial] How to fix SAXParse errors in LibreOffice files

Alternatively send the file to FJCC - use the email button next to his post.

See the tutorial - if you have opened the broken file, and then saved it with the same name so you overwrote the original file, then all truncated text will have been lost. If you have done so your only hope is to see [Tutorial] How to find and un-delete Writer temporary files for

a) detailed instructions on how to recover your file as it was when you last opened or saved it, or as it was when it was last saved with AutoRecovery;

b) how to find previous versions of the file in the folder it is located in, but which have since been deleted;

c) how to un-delete the temporary files Writer wrote while you were editing the file, and then deleted. This will recover your file as it was when you last opened or saved it and is probably your best hope. As it was a .docx file, follow the instructions for recovering an .odt file.

See [Tutorial] Differences between Writer and MS Word files for why you should always work in and save files as .odt.
Attachments
Clipboard02.gif
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
FJCC
Moderator
Posts: 9248
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Missing large part of text due to error SAX

Post by FJCC »

I got a copy of the file from the OP. It opens in OpenOffice without complaint, but many of the images are missing. LibreOffice complains as follows:
Screen Shot 2018-07-02 at 16.25.53.png
I cannot find the problem in the XML code. Can someone else take a look?
John_Ha
Volunteer
Posts: 9583
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Missing large part of text due to error SAX

Post by John_Ha »

FJCC

I have sent you a PM with my email ID. Send me the file and I will look at it.
John_Ha
Volunteer
Posts: 9583
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Missing large part of text due to error SAX

Post by John_Ha »

I have inspected the file.

The file is a .docx file and uses MS Word "not part of the OOXML Standard" text boxes. AOO does not support text boxes (LO does), so text boxes and the items within them do not display in AOO.

The file opens with OpenOffice and displays 23 pages, where page 23 is as shown below. There is substantial content after page 23 in the XML file. "triangle" is at line 6,745 in Notepad++ (pretty printed).
last page.gif
There are numerous <mc:AlternateContent> tags which are Microsoft's way of saying "The following content is not part of the OOXML standard". AOO does not understand anything between <mc:AlternateContent> tags. There are several occurrences before line 6,745 (the first occurs at line 4,129) so some content is not being displayed by AOO in the 23 pages I can see.
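For anyone who wants to repeat the count of <mc:AlternateContent> tags without Notepad++, here is a minimal Python sketch. It assumes the XML has already been pretty printed so that each tag sits on its own line; the function name and the sample text are made up for illustration, and the line numbers it reports depend entirely on how the file was formatted.

```python
def find_tag_lines(xml_text: str, marker: str = "<mc:AlternateContent") -> list[int]:
    """Return the 1-based line numbers on which `marker` appears.

    Only opening tags match the default marker, because a closing tag
    starts with "</" rather than "<mc:".
    """
    return [
        number
        for number, line in enumerate(xml_text.splitlines(), start=1)
        if marker in line
    ]


# Hypothetical, heavily shortened stand-in for a pretty-printed document.xml.
sample = "\n".join([
    "<w:body>",
    "  <mc:AlternateContent>",
    "    <mc:Choice/>",
    "  </mc:AlternateContent>",
    "  <w:p/>",
    "  <mc:AlternateContent>",
    "  </mc:AlternateContent>",
    "</w:body>",
])

print(find_tag_lines(sample))
```

Comparing the reported line numbers with the line where the visible content stops shows how many of these blocks fall inside the part of the document that is still displayed.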

When I check the XML syntax, Notepad++ gives the following error, which I do not understand. The error seems to be about where I would expect it, which is just after the last content I can see.
error.gif
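The same kind of syntax check can be reproduced outside Notepad++. The sketch below uses Python's standard library and assumes the main document part is stored as word/document.xml inside the .docx archive, which is the usual layout. The function names and the file name in the comment are hypothetical, and the parser's error message will not necessarily match the wording Notepad++ gives.

```python
import xml.etree.ElementTree as ET
import zipfile


def read_document_xml(docx_path: str) -> bytes:
    """Read the main document part from a .docx file, which is a zip archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")


def first_xml_error(xml_bytes: bytes):
    """Return (line, column, message) for the first syntax error, or None."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return line, column, str(err)
    return None


# Tiny stand-in with a mismatched closing tag on its second line.
broken = b"<root>\n  <a>text</b>\n</root>"
print(first_xml_error(broken))

# With a real file (hypothetical name):
# print(first_xml_error(read_document_xml("broken.docx")))
```

The parser stops at the first error it meets, so after fixing one problem the check has to be run again to see whether another one follows.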
I think therefore there are two options.

1. I remove all the XML tags from document.xml. This will leave just the text content, so at least the text will be saved.

2. I have posted the code causing the problem below to see if anyone can identify the error in the XML. The tags look like a valid pair. If that error can be fixed, then the .docx file should open properly in LO, with all content visible. Unfortunately I cannot find a similar set of tags to use as a template to make a correction. I am fairly certain that the " pic " is incorrect in both tags, but I do not know what to replace it with.
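Option 1 can also be scripted. The sketch below is one possible approach and not necessarily how the text was actually extracted here: rather than deleting every tag, it collects the contents of the <w:t> text runs with a regular expression, so it keeps working when the XML is not well formed. It assumes the usual w: namespace prefix, the function name is made up, and it recovers plain text only (no formatting, images or tables).

```python
import html
import re

# Matches <w:t> and <w:t xml:space="preserve"> but not <w:tab/> or <w:tbl>.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>(.*?)</w:t>", flags=re.DOTALL)


def salvage_text(document_xml: str) -> str:
    """Pull the visible text out of document.xml without parsing it as XML."""
    paragraphs = []
    # Split on paragraph end tags so each chunk holds at most one paragraph.
    for chunk in document_xml.split("</w:p>"):
        text = html.unescape("".join(TEXT_RUN.findall(chunk)))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)


# Hypothetical fragment; the second paragraph is deliberately left unclosed.
sample = (
    '<w:p><w:r><w:t>Hello </w:t></w:r>'
    '<w:r><w:t xml:space="preserve">world &amp; all</w:t></w:r></w:p>'
    '<w:p><w:r><w:tab/><w:t>Second paragraph</w:t></w:r>'
)

print(salvage_text(sample))
```

Because it never builds an XML tree, the broken tag pair does not stop it, at the cost of losing everything except the text.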
Attachments
picbody error.gif
Last edited by John_Ha on Mon Jul 02, 2018 9:36 pm, edited 1 time in total.
John_Ha
Volunteer
Posts: 9583
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Missing large part of text due to error SAX

Post by John_Ha »

The text-only version is attached.
Attachments
text only.odt
(36.93 KiB) Downloaded 210 times