Can't fix content.xml error, maybe tracked-changes?

dreamshade
Posts: 3
Joined: Fri Apr 01, 2022 12:07 pm


Post by dreamshade »

I'm working on a copyediting homework assignment. The professor gave me a 30-page DOCX file, and I'm working on editing and formatting the text. I'm required to show Track Changes history for this assignment. I saved the file as ODT for editing in LibreOffice.

LibreOffice crashed and failed to restore the ODT, and I now get a content.xml error at (2,327205) when I try to open the ODT.

I found multiple threads on these forums describing similar errors. I downloaded Notepad++ and XML Copy Editor as described in this thread, and I tried to mess with the unzipped data. Here's what I found in content.xml:

1. The column number points to the front of a <text:span> tag in the file. I deleted that tag and rebuilt the ODT. This didn't fix the error.

2. I deleted all of the body text after this point, from column 327205 up to the </office:body> tag at the end of the document. This didn't fix the error.

3. I deleted some more text from the end of the file. When I rebuilt and opened the ODT, the content.xml error gave a new column number. That column was literally the last column in content.xml.

4. I deleted everything in the <office:automatic-styles> tag as described in another post. This had no effect. I also searched for the multiple Officename error and couldn't find anything.

5. I re-saved the original DOCX file as an ODT, unzipped it, copied everything in the content.xml before the first <text:p> tag, and pasted that over the header of my current file. This let me load the ODT with no errors, but all of my Track Changes history was gone.

6. I reverted to my ODT file's last error save and tried editing random things. When I deleted everything in the <text:tracked-changes> tag, I was able to load the ODT with no errors and no Track Changes history.
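For reference, step 6 can be scripted rather than done by hand in a text editor. The following is only a minimal Python sketch, not a tested recovery tool. It assumes content.xml is UTF-8, and it goes one step further than the manual edit by also removing the change markers in the body, which is my own assumption about what a clean result should look like. The function names are made up for illustration. It discards the change history entirely, so text recorded only as a deletion is lost.

```python
import re
import zipfile

# The whole <text:tracked-changes> ... </text:tracked-changes> element.
TRACKED_CHANGES = re.compile(
    r"<text:tracked-changes\b[^>]*>.*?</text:tracked-changes>", re.DOTALL
)
# Empty marker elements in the body that point back at a changed region:
# text:change, text:change-start and text:change-end.
CHANGE_MARKERS = re.compile(r"<text:change(?:-start|-end)?(?=[\s/>])[^>]*/>")


def strip_tracked_changes(content_xml):
    """Return the content.xml text without change history or body markers."""
    without_history = TRACKED_CHANGES.sub("", content_xml)
    return CHANGE_MARKERS.sub("", without_history)


def rebuild_odt(src, dst):
    """Copy the ODT at src to dst, cleaning content.xml on the way.

    src and dst may be file paths or binary file-like objects.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        names = zin.namelist()
        # ODF packages expect "mimetype" as the first entry, uncompressed.
        names.sort(key=lambda name: name != "mimetype")
        for name in names:
            data = zin.read(name)
            if name == "content.xml":
                # Assumption: content.xml is encoded as UTF-8.
                text = data.decode("utf-8")
                data = strip_tracked_changes(text).encode("utf-8")
            if name == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            zout.writestr(name, data, compress_type=method)
```

A call such as rebuild_odt("A2_draft01.odt", "A2_no_history.odt") writes a new file and leaves the damaged one untouched; the second filename is only an example.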

I'm guessing that one of the <text:changed-region> tags in the <text:tracked-changes> section is corrupt. I think one of the tracked-changes tags points to the body text at a point with no change history. But I've got 2400 lines of tracked-changes in the Notepad++ window, and I can't figure out how the tracked-changes tags link to comments in the body.
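On how the two parts link up: as far as I understand the OpenDocument format, each <text:changed-region> carries a text:id attribute, and the body refers back to it through text:change-id on <text:change-start>, <text:change-end> (for ranges) or <text:change> (for a single point, such as a deletion). The Python sketch below cross-checks those ids instead of reading 2400 lines by eye. The function name and the report keys are my own invention, and a mismatch is only a hint about where to look, not proof of what LibreOffice objects to.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Namespace bound to the "text:" prefix in ODF files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def _q(name, ns=TEXT_NS):
    """Build the {namespace}name form that ElementTree uses."""
    return "{%s}%s" % (ns, name)


def check_change_ids(content_xml):
    """Compare changed-region ids with the change markers in the body.

    content_xml is the raw bytes of content.xml.
    """
    root = ET.fromstring(content_xml)

    region_ids = set()
    for region in root.iter(_q("changed-region")):
        # Prefer text:id; fall back to xml:id if only that is present.
        rid = region.get(_q("id")) or region.get(_q("id", XML_NS))
        if rid:
            region_ids.add(rid)

    counts = {tag: Counter() for tag in ("change", "change-start", "change-end")}
    for tag, counter in counts.items():
        for marker in root.iter(_q(tag)):
            cid = marker.get(_q("change-id"))
            if cid:
                counter[cid] += 1

    marker_ids = set().union(*counts.values())
    starts, ends = counts["change-start"], counts["change-end"]
    return {
        # Regions that nothing in the document points at.
        "unreferenced_regions": sorted(region_ids - marker_ids),
        # Markers whose id has no matching region.
        "dangling_markers": sorted(marker_ids - region_ids),
        # Ranges with a different number of starts and ends.
        "unbalanced_ranges": sorted(
            cid for cid in set(starts) | set(ends) if starts[cid] != ends[cid]
        ),
    }
```

Feeding it the bytes of the unzipped content.xml returns three sorted lists of ids, which can then be searched for in Notepad++.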

I've only lost four or five hours of work, so it's okay if I can't restore the file. Since I figured out how to load the ODT without the Track Changes history, I can at least compare files and copy my changes over. But I spent a couple of hours deciphering this file, and I hate to let it go now. Any ideas?
Attachments
A2_draft01.odt
(65.79 KiB) Downloaded 136 times
Last edited by dreamshade on Fri Apr 01, 2022 8:18 pm, edited 1 time in total.
LibreOffice 7.2.1.2 (x64), Windows 10
John_Ha
Volunteer
Posts: 9604
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Can't fix content.xml error, maybe tracked-changes?

Post by John_Ha »

There is something wrong with the file but I cannot see what.

I get an error at (2,327205) in content.xml when I open it with LO, but I cannot see anything obviously wrong anywhere near 327205.
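For anyone who wants to look at the text around a reported position without scrolling, here is a minimal Python sketch. It assumes the pair (2,327205) means line 2, column 327205, both 1-based and counted in characters. LO may count bytes or something else, so treat the result as approximate. The function name and the <<HERE>> marker are made up for illustration.

```python
def context_at(text, line, column, width=60):
    """Return the text around a 1-based (line, column) position.

    The reported position is marked with <<HERE>>, with up to
    `width` characters of context on either side.
    """
    # Split on "\n" only, so other separator characters do not shift lines.
    target = text.split("\n")[line - 1]
    index = column - 1
    start = max(0, index - width)
    return target[start:index] + "<<HERE>>" + target[index:index + width]
```

With the content.xml text loaded into a string, context_at(text, 2, 327205) would show the 60 characters on each side of the reported column.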

When I check content.xml in Notepad++ there are no XML errors.

When I attempt to pull the file into a new LO document, LO crashes.

When I try to open it with AOO I get "Read Error - Error reading file", but I am not given a specific location, so it does not behave like a normal "Format error discovered in sub-document" problem.

As LO does not seem to create "Format error discovered in sub-document" problems, I think this is a bug in the XML caused by LO (or the supervisor's MS Word).

It would help to see the original .docx file to see if the error can be tracked down in it.
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
RoryOF
Moderator
Posts: 35210
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Can't fix content.xml error, maybe tracked-changes?

Post by RoryOF »

@John_Ha: _all_ the internal .xml files check out for me with XML Copy Editor. The zipped archive that forms the .odt file is also complete and shows no error.

Attempting to Insert the file into a blank OO document, I get "General Error".

It certainly has the fingerprints of a former .doc/.docx format on it. I checked for our old friend the delete range marker, but no trace of that.
Edit: That the file is labelled "A2" suggests to me that it might be a mislabelled drawing file of some type.
Certainly it would be good to see the original .docx file; we might be able to get a reliable conversion to .odt to allow the re-formatting assignment to be carried out.

Also try opening the .docx in AbiWord; it might then be possible to select the entire text, then Copy and Paste it into Open- or Libre-Office.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS

Re: Can't fix content.xml error, maybe tracked-changes?

Post by John_Ha »

It needs someone who understands XML better.

About the only thing I can offer is to recover the text content by removing all the XML tags.
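One possible way to do that with a script rather than by hand is sketched below in Python. It is only a sketch under assumptions: it collects the text of <text:p> and <text:h> elements, drops the <text:tracked-changes> block first so that deleted text does not reappear, and ignores elements such as <text:s>, <text:tab> and <text:line-break>, so some spacing is lost. Paragraphs nested inside frames or notes may show up twice. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "text:" prefix in ODF files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
TRACKED = "{%s}tracked-changes" % TEXT_NS
PARAGRAPH_TAGS = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)


def extract_paragraphs(content_xml):
    """Return the plain text of each paragraph and heading, in order.

    content_xml is the raw bytes of content.xml.
    """
    root = ET.fromstring(content_xml)

    # Remove the change history so deleted text is not recovered by mistake.
    # The element list is copied first because the tree is modified below.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == TRACKED:
                parent.remove(child)

    paragraphs = []
    for elem in root.iter():
        if elem.tag in PARAGRAPH_TAGS:
            # itertext() joins the text of nested spans, dropping the tags.
            paragraphs.append("".join(elem.itertext()))
    return paragraphs
```

Joining the result with "\n" gives a plain-text version that can be pasted into a fresh document; all formatting and all change history are gone.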

I agree it has been a .docx file, as it has WW... labels and shows the typical way MS butchers text, where a single sentence is broken down into individual words, each with its own style.

Re: Can't fix content.xml error, maybe tracked-changes?

Post by RoryOF »

I got the file to open (30-something pages) in AbiWord, then copied and pasted it into OpenOffice (file attached). No Track Changes information seems to have transferred.
Attachments
A2_draft01_recovered.odt
(36.58 KiB) Downloaded 140 times

Re: Can't fix content.xml error, maybe tracked-changes?

Post by RoryOF »

Were I tasked with editing the .docx file, I would be tempted not to use "Track changes"; instead I would note my edit changes (other than trivial) by means of /Insert /Comment.
Jan_J
Posts: 195
Joined: Wed Apr 29, 2009 1:42 pm
Location: Poland

Re: Can't fix content.xml error, maybe tracked-changes?

Post by Jan_J »

I can open and read the first attachment using LO 7.2 on Linux. It has 34 pages.
The file content.xml is valid XML as checked by xmllint.
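The same kind of well-formedness check can be done without xmllint. The Python sketch below uses the standard library's expat parser and reports the first error as a (line, column, message) tuple, with the column converted to 1-based; the function name is made up for illustration. Passing this check only shows that the XML is well formed. It says nothing about whether the content follows the ODF schema or what LO expects, which may be why the file passes here and still fails to load on some systems.

```python
import xml.parsers.expat


def first_xml_error(data):
    """Return (line, column, message) for the first well-formedness
    error in the given bytes, or None if the data parses cleanly.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of data.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        # expat counts columns from 0; add 1 for a 1-based column.
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset + 1, message
    return None
```

Reading content.xml in binary mode and passing the bytes to first_xml_error should return None for a file that xmllint accepts.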
JJ ∙ https://forum.openoffice.org/pl/
LO (26.2) ∙ Python (3.13|3.10) ∙ Unicode 17 ∙ LᴬTEX 2ε ∙ XML ∙ Unix tools ∙ Linux (Rocky|CentOS)

Re: Can't fix content.xml error, maybe tracked-changes?

Post by dreamshade »

Here's the original DOCX file if that helps. (I've renamed it A2 because it's "Assignment #2." The filename doesn't mean anything.)

If the file is doing weird things with DOCX tags, will it help if I start a new ODT and copy/paste everything from the DOCX into it? Or will a copy/paste bring all of the bad DOCX tags in with it?
Attachments
A2_draft01.docx
(108.62 KiB) Downloaded 160 times

Re: Can't fix content.xml error, maybe tracked-changes?

Post by RoryOF »

Here is an immediate open of the file, transferred to .odt format, with no editing on my part.

Thank you for confirming that the A2 is an assignment number, not a page size.
Edit: I suggest installing and using the Timed-Dated backup extension while editing this, or indeed, any important file.
Attachments
A2_draft01.odt
(50.85 KiB) Downloaded 129 times

Re: Can't fix content.xml error, maybe tracked-changes?

Post by dreamshade »

I'll probably just need to start the thing over, but I appreciate y'all taking a look. I'll take a look at the backup extension, thanks.

Re: Can't fix content.xml error, maybe tracked-changes?

Post by John_Ha »

dreamshade wrote: Here's the original DOCX file if that helps. (I've renamed it A2 because it's "Assignment #2." The filename doesn't mean anything.)
It opens fine for me with LO 7.1.8.1.

AOO is walking dead and you will need to swap to LO at some point in the future. As LO has fixed a number of bugs which cause complete data loss in AOO (including the one where the document will not open if you delete two comments, each attached to a range of text, while Track Changes is ON), I would recommend changing now. LO also has better compatibility with .docx files.

After 20+ years with OOo and AOO I changed to LO last year.