
[Solved] Corrupted file (SAXParseException error)

Posted: Sun Jan 20, 2019 10:30 pm
by a1ro
Good day. I cannot open a file in LibreOffice. It fails with the following error:
SAXParseException: "No namespace defined for pic"
SAXParseException: '{word/document.xml line 20}: Namespace prefix pic on bodyPr is not defined'
Stream 'word/document.xml', Line 20, Column 23379
Please help me recover the document. It is written in Russian.

My file

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 12:16 am
by Lupp
You posted the question in the LibreOffice branch, but your signature says you are using V 3.1.
That is inconsistent. In addition, OOo V 3.1 is very old. I don't think it ever claimed to be able to work with docx.

The document you attached is in the MS(TM) docx format. Was it created with MS Office? If so, an MS forum might be the correct place for this thread. If not: always save LibreOffice/Apache OpenOffice documents in the internationally approved ODF formats.

Did you already check the result you get if you ignore the error?

What I got with LibO V by ignoring the error you can find here: ... gnored.odt. A graphical object may be missing.

I don't know enough about docx to help you beyond this. I tried to locate the problem in the embedded document.xml, but failed for lack of knowledge.

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 12:19 am
by RoryOF
Lupp's result (linked above) is very close to what I got using a different method. I doubt you will get anything better.

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 3:08 am
by John_Ha
Having examined the file, it appears to be the usual type of error: all the user content before the error is displayed correctly, but none of the content after it is displayed, even where it is well formed and still present in document.xml.

I think the error was caused by LO when writing the .docx file.

The last word displayed in the corrupted file is "Схема:"

When I pretty-printed document.xml I got an XML parsing error at line 18625: "Namespace prefix pic on bodyPr is not defined". I do not know what that means or how to fix it. Had I been able to fix it, I think the file would then have opened. There have been other posts on this: see search on namespace prefix
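For anyone who wants to find the position of such an error without pretty-printing first, here is a minimal sketch using only the Python standard library. The function name `find_xml_error` is made up for illustration. It reads word/document.xml straight from the .docx (which is a zip archive) and lets expat, with namespace checking switched on, report the first place where the XML is not well formed. Because it works on the raw file, the line it reports should correspond to the line in the LibreOffice message rather than to a pretty-printed line number.

```python
import xml.parsers.expat
import zipfile


def find_xml_error(docx_path, member="word/document.xml"):
    """Return None if the member is well formed, else (line, column, message).

    Hypothetical helper for illustration; it only reports the first error.
    """
    with zipfile.ZipFile(docx_path) as archive:
        data = archive.read(member)

    # Passing a namespace separator turns on namespace processing, so a
    # prefix that is used but never declared is reported as an error.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        # lineno is 1-based, offset is the 0-based column within that line.
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None
```

Calling `find_xml_error("broken.docx")` (the file name is a placeholder) gives either `None` or a tuple with the line, the column and expat's own description of the problem.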

Very soon after "Схема:" an <mc:AlternateContent> tag appears, which marks what follows as content that goes beyond the base OOXML markup. The error causing the problem is inside the AlternateContent, about 120 lines further on. The AlternateContent is probably an MS Textbox.
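As far as the message itself goes, "Namespace prefix pic on bodyPr is not defined" means that an element is written as pic:bodyPr at a point where no xmlns:pic declaration is in scope. One possible repair, which nobody in this thread actually tried, is to declare the prefix on the root element of document.xml so that it is in scope everywhere. The sketch below does that on the XML text; `declare_on_root` and `is_well_formed` are made-up helper names. The URI is the usual DrawingML picture namespace. Whether LibreOffice would then render the text box properly is an assumption: this only makes the XML well formed again, it does not make pic:bodyPr a meaningful element.

```python
import re
import xml.parsers.expat

# Standard DrawingML picture namespace. That declaring it is enough to make
# this particular file open is an assumption, not confirmed in the thread.
PIC_NS = "http://schemas.openxmlformats.org/drawingml/2006/picture"


def is_well_formed(xml_text):
    """Check well-formedness, including namespace prefix declarations."""
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


def declare_on_root(xml_text, prefix, uri):
    """Add xmlns:prefix="uri" to the root start tag unless already there.

    Works on text because the broken file cannot be parsed. It assumes the
    first '<' followed by a letter starts the root element, which holds for
    document.xml files that begin with the XML declaration and the root.
    """
    root = re.search(r"<[A-Za-z_][^>]*>", xml_text)
    if root is None:
        raise ValueError("no root element start tag found")
    if re.search(rf"\sxmlns:{re.escape(prefix)}\s*=", root.group(0)):
        return xml_text  # already declared on the root, nothing to do
    # Insert the declaration directly after the root element's name.
    name_end = root.start() + re.match(r"<[^\s/>]+", root.group(0)).end()
    return xml_text[:name_end] + f' xmlns:{prefix}="{uri}"' + xml_text[name_end:]
```

The repaired text would still have to be written back into a copy of the .docx archive in place of word/document.xml. Work on a copy and keep the original file untouched.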



The only thing I could do was strip all the XML tags to recover the user text in the file. The result is completely unformatted, but it contains all the user text. If you want the images, just unzip the .docx file and look in the word/media folder.
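The tag stripping described above can be approximated with a few lines of Python. This is a rough sketch, not the exact method used for the recovered file: the function names are made up, and a regular expression is used instead of an XML parser precisely because the file is not well formed. Paragraph ends (</w:p>) are turned into line breaks so the text does not come out as one long line.

```python
import html
import re
import zipfile


def recover_text(docx_path, member="word/document.xml"):
    """Return the plain text of a .docx by stripping every XML tag."""
    with zipfile.ZipFile(docx_path) as archive:
        xml_text = archive.read(member).decode("utf-8", errors="replace")
    # Each closing paragraph tag becomes a newline.
    xml_text = xml_text.replace("</w:p>", "\n")
    # Remove everything that looks like a tag, then decode &amp;, &lt; etc.
    text = re.sub(r"<[^>]+>", "", xml_text)
    return html.unescape(text)


def extract_media(docx_path, out_dir):
    """Unpack the embedded images (word/media/...) into out_dir."""
    with zipfile.ZipFile(docx_path) as archive:
        names = [
            name
            for name in archive.namelist()
            if name.startswith("word/media/") and not name.endswith("/")
        ]
        archive.extractall(out_dir, members=names)
    return names
```

Text inside an <mc:AlternateContent> block may appear twice in the output, because the main and the fallback variant usually both carry it.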

NB This file is complicated by having an MS Textbox (an MS Textbox is part of MS Draw). LO supports and displays MS Textboxes and their content but AOO neither supports nor displays MS Textboxes. Hence, regardless of the error, AOO would never have displayed anything between the <mc:AlternateContent> and the </mc:AlternateContent> tags.

See [Tutorial] Differences between Writer and MS Word files for why you should always work in and save files as .odt.

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 9:05 am
by a1ro
Many thanks to all who responded.
John_Ha, you helped me a lot!

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 1:55 pm
by Lupp
I think I managed to repair a bit more. Check the linked file to find out. Of course, a few objects are still missing, but text (56 pages), structure, and formatting should largely be preserved. ... tegory.odt

Please tell me if I should remove the files (the one linked here and the one linked yesterday) from public access.

Re: Corrupted file (SAXParseException error)

Posted: Mon Jan 21, 2019 2:23 pm
by a1ro
Lupp, thank you so much! The document looks like the original. Thanks again to all those who responded!