I have been doing some analysis and can now get AOO Writer to add the office:name corruption to a .odt file, although it may only do so after the .odt file has been processed with MS Word. That said, MS Word may not be involved at all, although I strongly suspect it is.
I extracted content.xml from the poster's Sammy Russel 1draft.odt file. Note that the P1 Style definition has been corrupted: the redundant and incorrect attributes office:name="__Annotation__153_24419901911111111" office:name="__Annotation__158_2441990191111" office:name="__Annotation__248_244199019111111" office:name="__Annotation__401_244199019111" office:name="__Annotation__414_24419901911" have been inserted into it.
I deleted these redundant items and re-inserted content.xml to get the attached file Sammy Russel 1draft - CORRECTED.odt. At this stage I thought that the .odt file must be OK.
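For anyone who wants to repeat the manual repair above without hand-editing, here is a rough Python sketch. It assumes the corruption always takes the form of office:name="..." attributes inside a <style:style ...> start tag, and that no attribute value in such a tag contains a ">" character. The function names strip_office_names and repair_odt are my own and have nothing to do with AOO itself.

```python
import re
import zipfile

# Assumption: the corruption is always office:name="..." inside a
# <style:style ...> start tag, as seen in the P1 Style definition.
STYLE_TAG = re.compile(r'<style:style\b[^>]*>')
OFFICE_NAME = re.compile(r'\s+office:name="[^"]*"')


def strip_office_names(xml_text):
    """Remove every office:name attribute found in a <style:style> start tag."""
    def clean(match):
        return OFFICE_NAME.sub("", match.group(0))
    return STYLE_TAG.sub(clean, xml_text)


def repair_odt(src_path, dst_path):
    """Copy an .odt file entry by entry, rewriting only content.xml."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                data = strip_office_names(text).encode("utf-8")
            # Passing the original ZipInfo keeps entry order and the
            # per-entry compression method (mimetype stays uncompressed).
            dst.writestr(item, data)
```

The office:name attributes on the annotations themselves are left alone, because only <style:style> start tags are touched.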
I opened Sammy Russel 1draft - CORRECTED.odt and the file opens without problem. I then made a trivial edit (add a space in front of Case Summary) and saved the file.
When I opened the saved file I got the Read Error message and, much to my surprise, Writer had corrupted the P1 Style definition by inserting one or more office:name attributes into it.
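A quick way to check a saved content.xml for this kind of damage is to look for any start tag that carries the same attribute name more than once, which is not allowed in well-formed XML. The sketch below is only a rough regex scan, not a real XML parser (a real parser would simply refuse the corrupted file). It assumes double-quoted attribute values, and the function name is hypothetical.

```python
import re
from collections import Counter

# Rough patterns: a start tag, and an attribute name followed by ="
START_TAG = re.compile(r'<[A-Za-z_][^<>]*>')
ATTR_NAME = re.compile(r'\s([\w:.-]+)\s*=\s*"')


def tags_with_duplicate_attributes(xml_text):
    """Yield (line_number, attribute_name, count) for each attribute
    that appears more than once within a single start tag."""
    for match in START_TAG.finditer(xml_text):
        counts = Counter(ATTR_NAME.findall(match.group(0)))
        line_number = xml_text.count("\n", 0, match.start()) + 1
        for name, count in counts.items():
            if count > 1:
                yield line_number, name, count
```

Running it over the text of content.xml before and after a save in Writer should show whether the save is the step that introduces the repeated office:name attributes.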
I think the problem occurs:
- when Record Changes is being used;
- when changes are made using MS Word;
- where the MS Word changes include one or more comments attached to a range of characters;
- where the file is then edited by AOO;
- where the file is then saved by AOO.
I think it is when AOO saves the file at the final step that the corruption is inserted. Now, it could be that AOO has problems with something MS Word has written to the file, and it is this which causes AOO to write the corruption. Or it could be that MS Word does nothing untoward and that AOO just corrupts the file.
Notes:
1. It appears that the file was created by poster, author SN, using AOO Writer. The file was sent to reviewer SD who used MS Word and recorded changes on 20 Mar 2018. Some changes were "Comments attached to a range of characters" and it is these Comments which use the office:name definitions.
2. Author SN then recorded more changes to the file using AOO on 22 Mar. In fact, Record Changes is still ON in the poster's file.
3. At some stage, the poster's file became corrupted. I think this probably happened when author SN edited and saved the file after it had been edited with MS Word.
4. Analysis of the time stamps of the edits shows that each change is timed at nn:nn:00.0n seconds. It seems strange to me that the time is always set to 00.0n seconds. The times are shown below where 20 = date 20th.
The first five office:name entries listed below appear in the file and also corrupt the P1 Style definition. The sixth, seventh and eighth appear in the file but do NOT corrupt the P1 Style definition. The sixth was the earliest, recorded at 09:51:00.02.
The other twenty times are recorded changes which were not Comments added to a range of characters. Note that the same 12:18:00.06 time is recorded for two different changes.
Note the repeated appending of the digit 1 ("111...") to the first five names.
Note how the decimal component of the seconds increments throughout - I would expect it to be more random.
The times below are in the order in which they appear, from start to end, in content.xml.
office:name="__Annotation__153_24419901911111111"  line 200   20  9:56:00.04   SD
office:name="__Annotation__158_2441990191111"      line 220   20  9:57:00.04   SD
office:name="__Annotation__248_244199019111111"    line 351   20  10:39:00.04  SD
office:name="__Annotation__401_244199019111"       line 859   20  12:18:00.06  SD
office:name="__Annotation__414_24419901911"        line 958   20  12:20:00.06  SD
office:name="__Annotation__3_244199019"            line 1260  20  9:51:00.02   SD
office:name="__Annotation__396_244199019"          line 1522  20  12:18:00.06  SD
office:name="__Annotation__551_244199019"          line 1636  20  12:50:00.08  SD
09:54:00.04
11:50:00.04
10:43:00.05
10:41:00.05
12:21:00.05
11:40:00.05
11:52:00.05
11:56:00.06
12:43:00.06
12:18:00.06 line 816
12:27:00.06
12:29:00.07
12:28:00.07
12:39:00.07
12:40:00.07
12:42:00.08
12:42:00.08
12:44:00.08
12:46:00.08
12:50:00.08
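To make the pattern in the seconds easier to see, here is a small Python sketch that splits each time into its parts and lists the fractional seconds in time-of-day order. It only covers the eight annotation times copied from the listing above; the other twenty could be added to the list in the same way. The helper names are my own.

```python
# Times of the eight annotation entries, copied from the listing above.
ANNOTATION_TIMES = [
    "9:56:00.04", "9:57:00.04", "10:39:00.04", "12:18:00.06",
    "12:20:00.06", "9:51:00.02", "12:18:00.06", "12:50:00.08",
]


def parse_time(text):
    """Split 'H:MM:SS.ff' into (hour, minute, second, hundredths)."""
    hms, fraction = text.split(".")
    hour, minute, second = (int(part) for part in hms.split(":"))
    return hour, minute, second, int(fraction)


def summarise(times):
    """Return the distinct whole-second values and the list of
    fractional parts, ordered by time of day."""
    parsed = sorted(parse_time(t) for t in times)
    whole_seconds = {p[2] for p in parsed}
    fractions = [p[3] for p in parsed]
    return whole_seconds, fractions


whole_seconds, fractions = summarise(ANNOTATION_TIMES)
print(whole_seconds)  # {0}
print(fractions)      # [2, 4, 4, 4, 6, 6, 6, 8]
```

For these eight entries the whole seconds are always 0 and the hundredths never go down as the morning goes on.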
I have updated the bug report with these findings. Unfortunately I do not have MS Word to do any more diagnosis.
I have attached content.odt, which is content.xml; it is too big to include as text in the post.