[Hint] How did I fix my ODT file

ajut001
Posts: 2
Joined: Thu Jan 10, 2008 1:27 am

[Hint] How did I fix my ODT file

Post by ajut001 »

Hello,

I had a problem with a 20-page ODT file (text and pictures): when I tried to open it,
I got the message "Error reading file" under OO 2.3.1 (both Linux and Windows versions).

It took some hours to figure out how to fix it, so I want to share my solution with other OO users :)

First, let's call our non-opening ODT file "bad.odt".
  1. make a backup FIRST -> "$ cp bad.odt bad_original.odt"
  2. make a new directory -> "$ mkdir repair"
  3. copy bad.odt to the repair directory -> "$ cp bad.odt repair"
  4. change to the repair directory -> "$ cd repair"
  5. unzip bad.odt, then delete the copy so that it is not zipped back into the repaired file -> "$ unzip bad.odt && rm bad.odt"
  6. after unzipping you get a bunch of files and directories under repair; find content.xml and open it with your favorite text editor -> "$ kate content.xml"
  7. use the "find" function to check whether you have the XML tag "<office:automatic-styles>" (somewhere near the beginning of the document) and the XML tag "</office:automatic-styles>" (somewhere in the middle of the document). If you have both, delete them and all data between them. Be sure that you delete neither more nor less!
  8. save content.xml (keep the original name and place!)
  9. zip the extracted data back into one ODT document -> "$ zip -r ./bad_repaired.odt ./*"
  10. try to open the repaired document -> "$ ooffice ./bad_repaired.odt"
... and if you are lucky, OO will be able to open your document again ;)
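
For anyone who would rather script it, here is a minimal Python sketch of steps 5 to 9. It is not part of the original recipe: the function name strip_automatic_styles is made up for illustration, and it assumes the opening tag appears exactly as "<office:automatic-styles>" with no attributes. It also writes the "mimetype" entry first and uncompressed, which is what the ODF packaging rules expect.

```python
import re
import zipfile

# Matches the whole automatic-styles element, tags included (step 7).
AUTO_STYLES = re.compile(
    rb"<office:automatic-styles>.*?</office:automatic-styles>", re.DOTALL
)


def strip_automatic_styles(src: str, dst: str) -> bool:
    """Copy the ODF package src to dst without the automatic-styles block.

    Returns True if a block was found and removed from content.xml.
    """
    removed = False
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Put "mimetype" first if present; keep the other entries in order.
        names = sorted(zin.namelist(), key=lambda n: n != "mimetype")
        for name in names:
            data = zin.read(name)
            if name == "content.xml":
                data, count = AUTO_STYLES.subn(b"", data, count=1)
                removed = count > 0
            # "mimetype" is stored uncompressed; everything else is deflated.
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            zout.writestr(name, data, compress_type=method)
    return removed
```

Calling strip_automatic_styles("bad.odt", "bad_repaired.odt") leaves bad.odt untouched, so it doubles as the backup from step 1. As with the manual method, all automatic styles are lost.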

Well, I got back my text and pictures, but the price was no styles (font size, bold, headings, etc.).

If your document still does not open and you get a message like "Format error discovered in the file in sub-document content.xml at ...",
then you broke the XML structure and must go back to step 1 and try to be more careful when deleting things.

PS1: if you get CRC errors when unzipping the ODT, then my solution probably can't help you :(
PS2: I also tried inserting "bad.odt" into a new document, but still got the "Error reading file" message :(
PS3: the "META-INF/manifest.xml file" trick did not help either :(
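
The CRC check from PS1 can be done without unzipping anything. A minimal sketch using Python's standard zipfile module (the function name first_corrupt_member is just for illustration):

```python
import zipfile


def first_corrupt_member(path: str):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(path) as package:
        return package.testzip()
```

If this returns None, the package itself is intact and the problem is more likely in the XML. If the file is not a readable ZIP at all, zipfile.ZipFile raises zipfile.BadZipFile instead.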

If someone wants to investigate my broken ODT file, it can be downloaded from -> http://adsl213.pointclark.net/Eksam.odt

BR,
Ajut
Last edited by TerryE on Thu Jan 10, 2008 5:14 am, edited 1 time in total.
Reason: Changed title to add Hint and used numbered list bbcode for readability
TerryE
Volunteer
Posts: 1402
Joined: Sat Oct 06, 2007 10:13 pm
Location: UK

Re: [Hint] How did I fix my ODT file

Post by TerryE »

Note that I changed the title of your post. (Admins can do that :-P)

What Ajut describes is that it is possible to create an ODT file (due to bugs in Writer) which may not then load back into OOo. This is often due to invalid styles, and by going into the ODT with a ZIP tool and editing the raw XML in content.xml as he describes, you can often recover the majority of the content. If you want, you could also use a binary chop to find out which of the actual styles is causing the problem. This takes a little more time, but you can then recover (almost) the entire content.
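
A rough sketch of the binary chop in Python. The loads_ok callback is hypothetical: in practice it means "rebuild the ODT with only these styles and see whether OOo opens it", which is a manual step. The sketch assumes a set of styles fails to load exactly when it contains at least one bad style.

```python
from typing import Callable, List, Sequence


def find_one_offender(items: Sequence[str],
                      loads_ok: Callable[[Sequence[str]], bool]) -> str:
    """Binary chop: items must fail as a whole; returns one failing item."""
    lo, hi = 0, len(items)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if not loads_ok(items[lo:mid]):
            hi = mid   # a bad item is in the first half
        else:
            lo = mid   # first half is clean, so look in the second half
    return items[lo]


def find_all_offenders(items: Sequence[str],
                       loads_ok: Callable[[Sequence[str]], bool]) -> List[str]:
    """Repeat the chop, removing each offender found, until the rest loads."""
    remaining = list(items)
    offenders = []
    while remaining and not loads_ok(remaining):
        bad = find_one_offender(remaining, loads_ok)
        offenders.append(bad)
        remaining.remove(bad)
    return offenders
```

Each offender costs about log2(N) test loads, so even a few hundred automatic styles need only a handful of attempts per bad style.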

In this case the problem was to do with three offending styles T155, T159 and T162 which are used to frame three of his formulae. These all have the same problem: They use a style:text-position attribute within a style:text-properties tag to place the text on the line. According to the ODF spec:
  • Use the style:text-position formatting property to specify whether text is positioned above or below the baseline and to specify the relative font height that is used for this text. This attribute can have one or two values.

    The first value must be present and specifies the vertical text position as a percentage that relates to the current font height or it takes one of the values sub or super. Negative percentages or the sub value place the text below the baseline. Positive percentages or the super value place the text above the baseline. If sub or super is specified, the application can choose an appropriate text position.

    The second value is optional and specifies the font height as a percentage that relates to the current font-height. If this value is not specified, an appropriate font height is used. Although this value may change the font height that is displayed, it never changes the current font height that is used for additional calculations.
So a subscript might have style:text-position="-50% 50%". The spec doesn't lay down any limits for the text position, but having had a play around, I found that the Writer document loader only accepts sub values of magnitude <= 101% (yes, that extra 1 is bizarre). These three styles have sub values of -223%, -116% and -125%. Set these to -100% and the doc loads fine.
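
To hunt for such styles without reading the XML by eye, something like the following Python sketch could be used. The function name and the default limit of 101 are assumptions taken from the observation above, not from the spec; the namespace URI is the standard ODF style namespace.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
STYLE = "{%s}" % STYLE_NS


def suspicious_text_positions(content_xml: bytes, limit: float = 101.0):
    """List (style name, text-position value) pairs whose first value is a
    negative percentage beyond -limit."""
    root = ET.fromstring(content_xml)
    hits = []
    for style in root.iter(STYLE + "style"):
        name = style.get(STYLE + "name")
        for props in style.iter(STYLE + "text-properties"):
            value = props.get(STYLE + "text-position")
            if not value:
                continue
            first = value.split()[0]
            if not first.endswith("%"):
                continue  # "sub" or "super": nothing to check
            try:
                percent = float(first[:-1])
            except ValueError:
                continue
            if percent < -limit:
                hits.append((name, value))
    return hits
```

This only works if content.xml is still well-formed XML, which was the case for this document.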

So what we have is a bug in Writer. I need to have a bit more of a play and raise this one as an Issue.

BTW Ajut, you will find it a lot easier to do this sort of doc if you use styles rigorously. Set up a text style called eqn, say, where you have no language (and hence you don't get spelling errors for your symbols) and a default font and emphasis that make your equations stand out, and use a keyboard shortcut to set the styles you want.
Ubuntu 11.04-x64 + LibreOffice 3 and MS free except the boss's Notebook which runs XP + OOo 3.3.
ajut001
Posts: 2
Joined: Thu Jan 10, 2008 1:27 am

Re: [Hint] How did I fix my ODT file

Post by ajut001 »

First: thanks, TerryE, for the very quick and professional reply to my post :)

I made the recommended replacements (see below) in the original "content.xml" and got my layout back :).

The original document was created by copy-pasting from different PPT files into OOo. The only changes
the user made were to font styles and sizes, so it may be a bug in the copy-paste layer.
The user did not try to reopen (save->close->open) the document during the creation process, so she got
the error the next time she tried to open the saved document.

BR,
Ajut

Replacements:
---
<style:text-properties style:text-position="-283% 100%" -> <style:text-properties style:text-position="-100%"
<style:text-properties style:text-position="-116% 100%" -> <style:text-properties style:text-position="-100%"
<style:text-properties style:text-position="-125% 100%" -> <style:text-properties style:text-position="-100%"
---
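
The same kind of replacement can be scripted. A minimal Python sketch, with an illustrative function name and a floor of -100% as suggested earlier in the thread; note that, unlike the replacements listed above, it keeps the optional second value (the font height) rather than dropping it, which is an untested assumption about what the loader accepts.

```python
import re

# First value as a percentage, plus an optional second value.
TEXT_POSITION = re.compile(
    r'style:text-position="(-?\d+(?:\.\d+)?)%((?:\s+[^"]*)?)"'
)


def clamp_sub_positions(xml_text: str, floor: float = -100.0) -> str:
    """Raise any text-position percentage below floor up to floor."""
    def fix(match):
        if float(match.group(1)) >= floor:
            return match.group(0)  # already in range, leave untouched
        return 'style:text-position="%g%%%s"' % (floor, match.group(2))
    return TEXT_POSITION.sub(fix, xml_text)
```

Values given as "sub" or "super" do not match the pattern and are left alone.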
TerryE
Volunteer
Posts: 1402
Joined: Sat Oct 06, 2007 10:13 pm
Location: UK

Re: [Hint] How did I fix my ODT file

Post by TerryE »

Ajut,

Hmm, how do you fancy helping me track down a rather nasty bug in Writer? We regularly get people who say "help me, I suddenly can't read my (usually ODT) file", but without hard test cases it is difficult to replicate the bug(s) and therefore impossible to get the developers on the case. If we can identify and eliminate this one then we will help a lot more users than just this one. So now you've added two more bits of information: (a) a possible path by which the corruption occurred, and (b) the inference, from your reference to "user", that you are probably some form of IT support guy, so we can probably have a deeper conversation on this.

I've been looking at the code for the XML exporter and the XML importer. It seems to be using a standard framework which is generated from the XML DTD, with a whole load of stubs to do the filling in so that the internal structures can be mapped to XML and vice versa. The issue arises if the outbound validation is a lot more lax than the inbound (why bother validating the outbound? It's valid already, isn't it?). Hence you can get into the situation where you can create content which you can save, but not then reload on the next load of the document. I suspect that the assumption is that you can't create invalid data because the GUI has the validation that you need, but what if you create the content by pasting in rich text and thereby bypass the normal GUI?
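
As a toy model of that asymmetry (this is not OOo code; the function names are invented and the 101% limit is simply the value observed earlier in the thread), an exporter that writes anything and an importer that checks the range will fail to round-trip some values:

```python
def export_position(percent: float) -> str:
    """Toy exporter: serialises whatever it is given, with no range check."""
    return 'style:text-position="%g%%"' % percent


def import_position(attribute: str, limit: float = 101.0) -> float:
    """Toy importer: rejects values the exporter was happy to write."""
    value = float(attribute.split('"')[1].rstrip("%"))
    if abs(value) > limit:
        raise ValueError("text-position %g%% out of range" % value)
    return value


def round_trips(percent: float) -> bool:
    """True if a value survives export followed by import."""
    try:
        return import_position(export_position(percent)) == percent
    except ValueError:
        return False
```

Here round_trips(-50) succeeds while round_trips(-223) does not, which is the "can save but cannot reload" situation in miniature.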

OK, I'll have a play to see if I can generate a synthetic case, but could I possibly ask a favour: could you see if your user has the original PPT/DOC that created Section 67: "Sirge parameetrilised ja kanoonilised võrrandid"? If so, let me know. If necessary we should be able to cut it down to that one slide / page as a test case. I'll mail you my mail ID in case you want to send me any attachments. If you want to pass on any private material then you can do it via that email, and we can post a public highlight later.
Ubuntu 11.04-x64 + LibreOffice 3 and MS free except the boss's Notebook which runs XP + OOo 3.3.
TerryE
Volunteer
Posts: 1402
Joined: Sat Oct 06, 2007 10:13 pm
Location: UK

Re: [Hint] How did I fix my ODT file

Post by TerryE »

I managed to use an RTF to create the same effect. See http://qa.openoffice.org/issues/show_bug.cgi?id=76465.

The issue is badly titled but this is the same underlying bug.
Ubuntu 11.04-x64 + LibreOffice 3 and MS free except the boss's Notebook which runs XP + OOo 3.3.
rodrigo
Posts: 1
Joined: Tue Jan 22, 2008 11:19 pm

Re: [Hint] How did I fix my ODT file

Post by rodrigo »

Great post!!!

It worked perfectly for me!! (the thing about <office:automatic-styles> tags).

I had a big, important ODT file, which was the result of many back-and-forth edits between MS Word (Office 2003) and Writer (2.3.1). I had tried several things, also from other forums, but nothing worked.

I deleted all in between the <office:automatic-styles> .. </office:automatic-styles>
AND NOW I CAN OPEN THE FILE!!!

thanks a lot, this helped me A LOT!!! :D :D

saludos,
rodrigo
Anodos12
Posts: 5
Joined: Wed Mar 05, 2008 12:07 am

Re: [Hint] How did I fix my ODT file

Post by Anodos12 »

I didn't try this method, but I did get this message:

"Format error discovered in the file in sub-document content.xml at position 2,155278(row,col)."

Can this problem be fixed via the method described here, or does the broken XML code need repair in some other way?
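
The row and column in that message can be used to look at the offending spot directly. A minimal Python sketch, assuming the row is counted from 1; whether the column is counted from 0 or 1 is not known here, so the function (an illustrative name) returns a window on both sides of it:

```python
def context_at(data: bytes, row: int, col: int, width: int = 40) -> bytes:
    """Return the bytes around (row, col) in an XML file read as bytes."""
    lines = data.split(b"\n")
    line = lines[row - 1]             # rows assumed to start at 1
    start = max(0, col - 1 - width)   # window begins width bytes before col
    return line[start:col - 1 + width]
```

For the message above, that would be context_at(data, 2, 155278) on the bytes of the extracted content.xml. OOo usually writes everything after the XML declaration on one long second line, which is why the row is 2 and the column is so large.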
Anodos12
Posts: 5
Joined: Wed Mar 05, 2008 12:07 am

Re: [Hint] How did I fix my ODT file

Post by Anodos12 »

Crap, didn't read carefully enough: my file is an .odp file, not an .odt file.
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

It doesn't matter: all the ODF file formats share the same basic structure, so the same approach can work for any of them.

If you want to attach your file, we can try to help.
AOO4/LO5 • Linux • Fedora 23
Anodos12
Posts: 5
Joined: Wed Mar 05, 2008 12:07 am

Re: [Hint] How did I fix my ODT file

Post by Anodos12 »

Great, yes, thank you, this took hours of work, I greatly appreciate it.

Actually, I can't, the file is 259 KB.
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

You can use one of the free file sharing sites, such as filecrunch.com or mediafire.com.
AOO4/LO5 • Linux • Fedora 23
Anodos12
Posts: 5
Joined: Wed Mar 05, 2008 12:07 am

Re: [Hint] How did I fix my ODT file

Post by Anodos12 »

acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

Try this one: Signs_recovered.odp
AOO4/LO5 • Linux • Fedora 23
Anodos12
Posts: 5
Joined: Wed Mar 05, 2008 12:07 am

Re: [Hint] How did I fix my ODT file

Post by Anodos12 »

Wonderful. If you're ever in Chicago and want a deep dish pizza on me, just shoot me an email. You don't even have to meet me, I'll just have it delivered to your hotel. :P
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

Just FYI: the document had some invalid character data on slide 92. All I had to do was replace that invalid text with some valid characters and the document could be opened correctly. Of course, I have no idea what the correct text on slide 92 should be, so you'll still have to fix that.

It only took a couple of minutes; simple fix.
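
The post does not say what kind of invalid data it was. As one possible way to find such spots, here is a Python sketch that reports either the first invalid UTF-8 byte or every character that XML 1.0 does not allow (the function name is illustrative):

```python
import re

# Anything outside the character ranges that XML 1.0 permits.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_chars(data: bytes):
    """Return (offset, description) pairs for data that XML 1.0 rejects.

    The offset is a byte offset for a UTF-8 decoding failure and a
    character offset otherwise.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [(err.start, "invalid UTF-8 byte 0x%02x" % data[err.start])]
    return [
        (m.start(), "U+%04X" % ord(m.group()))
        for m in INVALID_XML_CHARS.finditer(text)
    ]
```

It assumes the file is meant to be UTF-8, which is what OOo writes by default.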
AOO4/LO5 • Linux • Fedora 23
Bostonaholic
Posts: 3
Joined: Tue Aug 12, 2008 10:51 pm

Re: [Hint] How did I fix my ODT file

Post by Bostonaholic »

acknak, if I upload an ODS file, do you think you could try and fix it for me? It is a calc file that is VERY important. I've tried the described method and it does not seem to work.

Thanks.
OOo 2.3.X on Ms Windows XP + Ubuntu 8.04
Hagar Delest
Moderator
Posts: 32594
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Hint] How did I fix my ODT file

Post by Hagar Delest »

Just do it, we can try.
LibreOffice 7.6.2.1 on Xubuntu 23.10 and 7.6.4.1 portable on Windows 10
Bostonaholic
Posts: 3
Joined: Tue Aug 12, 2008 10:51 pm

Re: [Hint] How did I fix my ODT file

Post by Bostonaholic »

OOo 2.3.X on Ms Windows XP + Ubuntu 8.04
Hagar Delest
Moderator
Posts: 32594
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Hint] How did I fix my ODT file

Post by Hagar Delest »

Well, there are a lot of encoding errors; I can't fix them all on my old laptop (it takes too much time). You should try it yourself with a good XML parser; it's not that difficult.

NB: you can upload files here if they are smaller than 128KB.
LibreOffice 7.6.2.1 on Xubuntu 23.10 and 7.6.4.1 portable on Windows 10
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

Ok, here's a recovered file.

This was a strange one, and took quite a bit of work to fix. The "diff" document contains a list of all the things that I changed to try and fix the file. I make no guarantee that there are no more errors. In fact, if you really care about the data, you'll throw this out and go back to your last known-good backup. You do have a backup, don't you? Ok, ok, this is probably salvageable if you don't have a backup, but you'll want to be very careful because there are still errors in your data (there are a few I found but did not change).

Here is a sample of the errors in the file:
diff.png
The light-gray text is context; the darker text is lines that have changes. The "-" lines contain the errors; the "+" lines are the fixed version. The orange-highlights show the errors.

I only fixed the errors in the XML code; I left your data unchanged. Some of the XML errors were enough to prevent the file from loading, some were not, so there's no guarantee that I've found all the problems, but the file does pass "xmllint" and the ODF Validator (http://tools.services.openoffice.org/odfvalidator/), so that's a fairly strong indication that the file structure is ok.

Here are the odd bits:
• first, the errors are all single-character differences, and the character code of the error is always one less than the correct character code.
• second, the errors are bunched in two small parts of the file (each about 70 lines, or 0.7%): lines 3078-3143 and 3278-3385

This does not look like random memory corruption; I have no idea what might cause this pattern of errors.
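
Given a broken line and its repaired version of the same length, the pattern is easy to confirm with a few lines of Python. This is only an illustration of the check, with a made-up function name, not the tool used to produce the diff:

```python
def off_by_one_errors(broken: str, fixed: str):
    """List (index, broken_char, fixed_char) where the broken character's
    code is exactly one less than the fixed character's code."""
    return [
        (i, bad, good)
        for i, (bad, good) in enumerate(zip(broken, fixed))
        if ord(good) - ord(bad) == 1
    ]
```

For example, comparing "000" with "100" reports position 0, where "0" should have been "1".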

However, because there are no errors in the XML from the rest of the file, it may be that your data and formulas are ok as well. You just need to carefully check the context lines (light gray) in the diff to see if there are any problems in your content.

There are some: I saw some very suspicious spelling errors, which I highlighted in green in the diff (also visible in the sample image above). If you agree that those are errors, you'll need to fix them in the spreadsheet. Scan the other context lines in the diff document to see if there are any other problems. Remember, the other errors all seem to be off by one, so a change in a number, say from 100 to 000, could be rather hard to see.

And if you have any ideas what circumstances might have triggered this, I'd be interested to hear about it.
Attachments
diff.odt
(14.9 KiB) Downloaded 21625 times
Last edited by acknak on Wed Aug 13, 2008 8:45 pm, edited 1 time in total.
Reason: Removed confidential attachment
AOO4/LO5 • Linux • Fedora 23
Bostonaholic
Posts: 3
Joined: Tue Aug 12, 2008 10:51 pm

Re: [Hint] How did I fix my ODT file

Post by Bostonaholic »

Wow, thank you so much acknak!!! You're a savior.

I had been working in the content.xml and found those weird one-off errors everywhere. It was taking me forever to get through all of them and I thought I found them all but it still wouldn't open. I haven't a clue what happened, maybe it was because they used to be xls then I converted them to ods??? But they had been working fine as ods for a while so who knows.

Again, thanks.
OOo 2.3.X on Ms Windows XP + Ubuntu 8.04
soti
Posts: 2
Joined: Fri Jan 09, 2009 12:08 pm

Re: [Hint] How did I fix my ODT file

Post by soti »

Dear all,

I have a similar problem. I have a 100-page document which is going to be a book, full of formulas, which I can't open after I saved it normally with OO Writer 3.0.0. I have tried the previous ideas: I removed the automatic styles from content.xml. I have also tried the ODF validator, which gives me an error:

:Fatal:SAXException:Attribute name "manifest:full-x" associated with an element type "manifest:file-entry" must be followed by the ' = ' character.

It seems to me that an = character is missing somewhere. Does anyone have any ideas on where the error might be?
I have checked the META-INF/manifest.xml, but it's around 1.6 MB so I cannot search through it manually...

Gergely
OOo 3.0.X on openSuse 11
soti
Posts: 2
Joined: Fri Jan 09, 2009 12:08 pm

Re: [Hint] How did I fix my ODT file

Post by soti »

Hello,

I fixed the problem, and I thought that I should share the solution with everyone. I downloaded RXP (an XML parser), available for both Windows and Unix. With it I scanned all the XML files I got by unzipping the original ODT document. Somewhere in the file META-INF/manifest.xml there was an error, as if some bytes had been changed to others. It looked like:
manifest:full-x@g@+@t@ject 615/Configurations2/progressbar/

instead of:

manifest:full-path="Object 615/Configurations2/progressbar/

So I just changed it back and it worked fine. I don't know what could have caused the problem: I use a brand new computer and the newest openSUSE, and the hard disk hasn't caused me problems before.
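
If RXP is not at hand, a similar scan can be sketched with Python's standard library. This swaps in the expat parser for RXP and only checks well-formedness, not validity against the ODF schema; the function name xml_errors is illustrative.

```python
import zipfile
from xml.parsers import expat


def xml_errors(package_path: str):
    """Check every .xml member of an ODF package for well-formedness.

    Returns a list of (member name, line, column, message) tuples,
    one for the first error found in each broken member.
    """
    errors = []
    with zipfile.ZipFile(package_path) as package:
        for name in package.namelist():
            if not name.endswith(".xml"):
                continue
            parser = expat.ParserCreate()
            try:
                parser.Parse(package.read(name), True)
            except expat.ExpatError as err:
                errors.append(
                    (name, err.lineno, err.offset, expat.ErrorString(err.code))
                )
    return errors
```

An empty list means every XML member parsed. The line and column point at the place to inspect in a text editor, which matters for a 1.6 MB manifest.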

regards,
Gergely
OOo 3.0.X on openSuse 11
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Hint] How did I fix my ODT file

Post by acknak »

Nice work! Thanks for the information.
AOO4/LO5 • Linux • Fedora 23
goatsxc
Posts: 3
Joined: Tue Feb 03, 2009 8:06 am

Re: [Hint] How did I fix my ODT file

Post by goatsxc »

I have two Calc files with similar problems ("format error discovered in the sub-document content.xml at 2,19926871(row,col)"). I saved them earlier this afternoon, when they were working fine. Currently I can't open either Calc document, and I've tried the steps in the initial post (I'm having trouble rezipping the files: I get some sort of error with 7zip). I was hoping someone in the community could take a look at the Calc files and see if they are fixable (important data I'm obviously hoping to recover):

http://www.mediafire.com/?sharekey=40fa ... f6e8ebb871

Thanks in advance!
OOo 3.0.X on MS Windows Vista
goatsxc
Posts: 3
Joined: Tue Feb 03, 2009 8:06 am

Re: [Hint] How did I fix my ODT file

Post by goatsxc »

I think I correctly unzipped, edited the XML file, and then rezipped things. When I try to open the new ODS file, however, I get this error message:

format error discovered in the sub-document $(ARG1) at $(ARG2)(row,col)

There doesn't seem to be much info about it on the forums or when I search Google. Any ideas?
OOo 3.0.X on MS Windows Vista
pamindic
Posts: 1
Joined: Thu May 27, 2010 1:39 pm

Re: [Hint] How did I fix my ODT file

Post by pamindic »

Ajut001's instructions to remove the automatic styles section from content.xml worked for me to recover my corrupt .ods file.
Much appreciated.
OpenOffice 3.2.0 Ubuntu 10.04
anonymouschick
Posts: 1
Joined: Tue Nov 23, 2010 9:33 pm

Re: [Hint] How did I fix my ODT file

Post by anonymouschick »

Hey guys, I had the same problem this morning and I'm still working on it. I have just one question about the way you fixed this: did you type those commands in a terminal? I'm sorry if I'm too new at this; I'm just desperate to get my file back, and as far as I can tell this seems to be the best way to get it done. I tried using a terminal, but I might be confusing something and I guess I'm doing it wrong (I'm on Linux).
Please help me if you can.
OpenOffice 3.2 on Ubuntu 10.10
Hagar Delest
Moderator
Posts: 32594
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Hint] How did I fix my ODT file

Post by Hagar Delest »

No terminal needed.
LibreOffice 7.6.2.1 on Xubuntu 23.10 and 7.6.4.1 portable on Windows 10
garbledtext
Posts: 2
Joined: Thu Dec 09, 2010 1:08 am

Re: [Hint] How did I fix my ODT file

Post by garbledtext »

Hello all,

I am a bit of a computer layman, so please be patient. After I replaced a .png image in my .odt document and then deleted the original from its folder, OOo promptly froze, then crashed, and now upon opening the document I'm getting the error "format error discovered in the file in sub-document content.xml", except mine ends with "at 1,0(row,col)". I found some people with a similar error and they all seem to be directed here.

To start off, I found the OP's command-line style advice to be a little cryptic. I gleaned from this advice that I was supposed to zip, then unzip the file, and upon unzipping the folder would contain content.xml. Initially I used XP's built-in zip utility and zipped it that way. When I unzipped it I simply got back the original file: "fixme.odt". Hmmm... I don't know why I expected something different to happen...

I figured I must be doing something wrong (possibly stemming from my lack of understanding of what XP's built-in zip utility is capable of) so I downloaded WinZip. I read on a different forum that the .odt file is actually a compressed file already, so I simply opened the .odt file in WinZip and extracted straight from there. Voila, I get all the appropriate files, including content.xml.

The problem I'm having now is that when I try to open content.xml in Notepad or Wordpad I get crazy garbled characters with those squares and all those funny foreign currency symbols. I take this to be a very bad sign. I read online that this type of text means that the file contains binary information rather than text, but I don't know how valid that is or if that information is at all helpful. I don't care about the formatting, I simply want the content back. Is there any way to get it or am I screwed?
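
One way to tell whether content.xml is still XML at all is to look at its first bytes. A minimal Python sketch, with an invented function name and rough categories that are only a first guess at what went wrong:

```python
import zipfile


def describe_member(package_path: str, member: str = "content.xml") -> str:
    """Give a rough description of what a package member contains."""
    with zipfile.ZipFile(package_path) as package:
        data = package.read(member)
    if not data:
        return "empty"
    if data.count(0) == len(data):
        return "all zero bytes"
    # Ignore a UTF-8 byte order mark and leading whitespace.
    if data.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "looks like XML"
    return "binary or encrypted: starts with " + repr(data[:16])
```

An error at position 1,0 is at least consistent with a file that is not XML from its very first byte, in which case editing content.xml by hand is unlikely to help.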
OOo 2.4.1 on Windows XP SP2