Format error discovered in the file in sub-document

Discuss the spreadsheet application
koko233
Posts: 1
Joined: Thu Feb 25, 2021 4:59 pm

Format error discovered in the file in sub-document

Post by koko233 »

Hello,

I have an error with an OpenDocument Calc spreadsheet that prevents me from opening it:
"Format error discovered in the file in sub-document content.xml at 2,291085(row,col)."
I tried a variety of ways to fix it myself, including unzipping the file, deleting the entire <office:automatic-styles> section with Notepad++ and re-zipping it, but to no avail.

If anyone can help, I would be very grateful.
I have attached a link to the spreadsheet. This is the original file without any changes.

Thank you in advance.

https://drive.google.com/file/d/1Ocxzsui9WttM1lH6TNREyaQAWIeHwic-/view?usp=sharing

Disabled live link
OpenOffice 4.1.9
Windows 10 20H2 x64
John_Ha
Volunteer
Posts: 9584
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: [Solved] Format error discovered in the file in sub-docu

Post by John_Ha »

 Edit: The post to which this refers has been deleted by the poster. 
Your file is very badly corrupted and well beyond repair or recovery of any user data.

The ZIP container reports a CRC error on content.xml. content.xml has been truncated and an unknown quantity of XML coding is missing.
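
A check of this kind can be reproduced with a few lines of Python. This is only a minimal sketch using the standard zipfile module, not the tool that was used here; the function name check_members is made up for illustration. It reads every member of the container to the end, which forces the CRC-32 comparison, and records which members fail.

```python
import zipfile
import zlib


def check_members(path):
    """Return {member name: "ok" or "damaged: ..."} for a ZIP/ODF container.

    `path` may be a file name or a binary file-like object.
    """
    results = {}
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            try:
                with zf.open(info) as member:
                    # Reading to the end forces decompression and the
                    # CRC-32 comparison against the stored value.
                    while member.read(1 << 16):
                        pass
                results[info.filename] = "ok"
            except (zipfile.BadZipFile, zlib.error, EOFError) as exc:
                results[info.filename] = f"damaged: {exc}"
    return results
```

Calling check_members on a copy of the .ods file would list each sub-document with its status. A CRC failure only says that the bytes differ from what was written; it does not say how much of the member is still readable.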

Much of the existing XML is very badly corrupted throughout. I do not know a way of recovering any user data as there is so much corruption.

The last few lines show the sort of problems: corrupted tags and then the XML is truncated.

I cannot even begin to guess how this happened as there is so much corruption throughout the entire file.

Code:

<table:table-ce1" office:value-type="float" office:valu785ice:date-value=019-03-2326:<tex327p>10.32e-cell office:value-type="float" office:valu6771.19999999998table-cell6771,2table:table-cell table:style-name="ce57" offi:="floa:table-cell><table:table-11ble-cell table:style-name="ce57" ofe-nam5ell office:value-1e-type="float" fice1" office:value-type="floa:table-cell><table:table-ce1" office:value-type="float" office:valu785ice:date-value=019-03-2327:<tex328p>10.32l table:style-name="ce30" office:value-type=6771.19999999998table-cell6771,2table:table-
Each block of colour is corruption and there are many further corruptions within what looks like good code.
[Attached image: content.xml.gif]
You need to do a thorough test of both memory and disk to see if it is a hardware problem. However, if you want to recover the file do this first:

See [Tutorial] How to find and un-delete AOO temporary files for detailed instructions on how to

a) use Previous Versions (Windows 7 and later) to recover previous versions of the file (is there something similar on macOS and Linux?);

b) recover your file as it was when you last opened or saved it, or as it was when it was last saved with AutoRecovery;

c) find previous versions of the file in the folder it is located in, but which have since been deleted;

d) find any temporary files AOO wrote while you were editing the file but which have not yet been deleted;

e) un-delete the temporary files AOO wrote while you were editing the file and then deleted.

Steps d) and e) will recover your file as it was when you last opened or last saved it.

If you cannot follow the instructions ask someone with more PC skills to help you. Act quickly - the longer you wait the more likely any temporary files are to be deleted.
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
Hagar Delest
Moderator
Posts: 32657
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Solved] Format error discovered in the file in sub-docu

Post by Hagar Delest »

John_Ha wrote:
 Edit: The post to which this refers has been deleted by the poster. 
No, I was splitting the post from an old topic while you were answering it!
Just a time issue.
Post merged in the new topic.
LibreOffice 7.6.2.1 on Xubuntu 23.10 and 7.6.4.1 portable on Windows 10
ardovm
Posts: 6
Joined: Thu Aug 01, 2019 11:13 am

Re: [Solved] Format error discovered in the file in sub-docu

Post by ardovm »

John_Ha wrote:...
Your file is very badly corrupted and well beyond repair or recovery of any user data.

The ZIP container reports a CRC error on content.xml. content.xml has been truncated and an unknown quantity of XML coding is missing.
IMHO this alone makes it impossible to recover the data.

It looks like content.xml is in the middle of the archive. If it were at the end, I would have suspected a truncation of some sort; but this is different.

The CRC32 is stored in the ZIP file alongside the compressed data. It means that OpenOffice calculated the CRC, compressed the data and then wrote... something else into the file.
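
To illustrate the two points above (where a member sits in the archive, and what the stored CRC refers to), here is a small Python sketch. It builds a tiny stand-in archive in memory, which is an assumption for the example and not the poster's file, and the helper name describe_members is made up. Note that zipfile reads the stored CRC from the archive's central directory.

```python
import io
import zipfile
import zlib


def describe_members(zip_bytes):
    """List (name, header_offset, stored_crc) for each entry, in archive order."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        infos = sorted(zf.infolist(), key=lambda info: info.header_offset)
        return [(info.filename, info.header_offset, info.CRC) for info in infos]


# Build a tiny stand-in archive (hypothetical content, not a real spreadsheet).
payload = b"<office:document-content/>"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("mimetype", b"application/vnd.oasis.opendocument.spreadsheet")
    zf.writestr("content.xml", payload)
    zf.writestr("meta.xml", b"<office:document-meta/>")

entries = describe_members(buf.getvalue())
for name, offset, crc in entries:
    print(f"{name:12s} offset={offset:5d} crc32={crc:08x}")

# The stored value is the CRC-32 of the *uncompressed* bytes.
stored = {name: crc for name, _, crc in entries}
assert stored["content.xml"] == zlib.crc32(payload)
```

The offsets show whether a member lies in the middle or at the end of the container, which is what distinguishes the situation here from a simple truncation of the whole file.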

What OpenOffice version generated this problem?
OpenOffice 4.1.13 on various platforms
John_Ha
Volunteer
Posts: 9584
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: [Solved] Format error discovered in the file in sub-docu

Post by John_Ha »

ardovm wrote:What OpenOffice version generated this problem?
meta.xml contains the following suggesting it was OpenOffice/4.1.6$Win32 OpenOffice.org_project/416m1$Build-9790

Code:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-meta xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" xmlns:ooo="http://openoffice.org/2004/office" xmlns:grddl="http://www.w3.org/2003/g/data-view#" office:version="1.2">
	<office:meta>
		<meta:generator>OpenOffice/4.1.6$Win32 OpenOffice.org_project/416m1$Build-9790</meta:generator>
		<meta:creation-date>2009-04-16T11:32:48Z</meta:creation-date>
		<dc:date>2021-02-10T14:32:58.43</dc:date>
		<meta:print-date>2018-08-17T18:24:03Z</meta:print-date>
		<meta:editing-cycles>4129</meta:editing-cycles>
		<meta:editing-duration>P19DT16H40M2S</meta:editing-duration>
		<meta:document-statistic meta:table-count="4" meta:cell-count="19744" meta:object-count="0"/>
		<meta:user-defined meta:name="Info 1"/>
		<meta:user-defined meta:name="Info 2"/>
		<meta:user-defined meta:name="Info 3"/>
		<meta:user-defined meta:name="Info 4"/>
	</office:meta>
</office:document-meta>
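
For anyone who wants to pull these fields out without opening the file in an editor, a minimal Python sketch follows. It assumes meta.xml has already been extracted from the container and is intact; SAMPLE is a shortened copy of the listing above and read_meta is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the meta.xml listing.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}

# Shortened copy of the listing, for illustration only.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<office:document-meta'
    b' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    b' xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"'
    b' office:version="1.2"><office:meta>'
    b'<meta:generator>OpenOffice/4.1.6$Win32 '
    b'OpenOffice.org_project/416m1$Build-9790</meta:generator>'
    b'<meta:creation-date>2009-04-16T11:32:48Z</meta:creation-date>'
    b'<meta:editing-cycles>4129</meta:editing-cycles>'
    b'</office:meta></office:document-meta>'
)


def read_meta(meta_xml):
    """Return generator, creation date and editing cycles from meta.xml bytes."""
    meta = ET.fromstring(meta_xml).find("office:meta", NS)

    def text(tag):
        node = meta.find(tag, NS)
        return node.text if node is not None else None

    return {
        "generator": text("meta:generator"),
        "creation_date": text("meta:creation-date"),
        "editing_cycles": text("meta:editing-cycles"),
    }


print(read_meta(SAMPLE))
```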
Strangely, only content.xml was corrupt - all other files in the zip container tested OK and appeared intact. The first 20% or so of content.xml looked OK but the corruption then began and continued throughout until the truncated end.
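
One way to estimate where the damage starts is to let an XML parser report the first place it gives up. The sketch below uses Python's expat bindings; first_xml_error is a made-up name, and the line and column it reports are expat's own counts, which are not guaranteed to match the "row,col" numbering in the OpenOffice error message. It also only finds the first well-formedness error, so text that is corrupt but still well-formed will pass unnoticed.

```python
import xml.parsers.expat


def first_xml_error(data):
    """Return (line, column, byte_offset, message) for the first parse error.

    Returns None when `data` (bytes) is well-formed XML.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        message = xml.parsers.expat.ErrorString(exc.code)
        return exc.lineno, exc.offset, parser.ErrorByteIndex, message
    return None
```

Dividing the returned byte offset by len(data) gives a rough idea of how far into the sub-document the parser got before the first error.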

It is quite bizarre. I have repaired many files but I have never before seen a file as badly corrupted as this one.

Does "<meta:editing-cycles>4129</meta:editing-cycles>" mean it has been edited 4,129 times; and "<meta:creation-date>2009-04-16T11:32:48Z</meta:creation-date>" mean it was created in 2009? We often see text documents which have been edited many (thousands) times become "tangled" (we don't know what tangled means). Inserting them into a new file reduces their size and improves their responsiveness. See examples here.
LO 6.4.4.2, Windows 10 Home 64 bit

ardovm
Posts: 6
Joined: Thu Aug 01, 2019 11:13 am

Re: [Solved] Format error discovered in the file in sub-docu

Post by ardovm »

John_Ha wrote:
ardovm wrote:What OpenOffice version generated this problem?
meta.xml contains the following suggesting it was OpenOffice/4.1.6$Win32 OpenOffice.org_project/416m1$Build-9790
Thank you for checking!

This makes debugging more difficult as we would need to understand if whatever caused this bug was already fixed in 4.1.9.
John_Ha wrote:Does "<meta:editing-cycles>4129</meta:editing-cycles>" mean it has been edited 4,129 times;
Yes.
John_Ha wrote:and "<meta:creation-date>2009-04-16T11:32:48Z</meta:creation-date>" mean it was created in 2009?
Yes.
John_Ha wrote:We often see text documents which have been edited many (thousands) times become "tangled" (we don't know what tangled means). Inserting them into a new file reduces their size and improves their responsiveness.
This may be related... or not... who knows :-( But thank you for pointing this out.
OpenOffice 4.1.13 on various platforms