I am fairly new to OpenOffice.
I have a .doc file that was converted from a .PDF file. It reads fine in Office 2003 but when I try to read it with Writer I get ASCII garbage in non-text areas (pictures etc.).
Any help would be appreciated.
[Solved] ASCII garbage in non-text areas
Last edited by joelstern on Tue Jan 15, 2008 12:33 am, edited 1 time in total.
Re: ASCII garbage in non-text areas
joelstern wrote: I am fairly new to OpenOffice.
With which OOo version on which operating system?
joelstern wrote: I have a .doc file that was converted from a .PDF file.
With which tool? Can you provide an example?
OOo 3.2.0 on Ubuntu 10.04 • OOo 3.2.1 on Windows 7 64-bit and MS Windows XP
Re: ASCII garbage in non-text areas
I have OpenOffice 2.3.0 on XP Professional. I have copies of both the PDF and the .doc files, but the .doc file is too large to attach.
Re: ASCII garbage in non-text areas
joelstern wrote: I have OpenOffice 2.3.0 on XP Professional. I have copies of both the PDF and the .doc files, but the .doc file is too large to attach.
Please read the Survival Guide (see the link in my sig line below).
A .doc file converted from a .pdf file will always have extraneous and erroneous data in it. There's no tool yet that can make that conversion cleanly, not even Acrobat itself. It is a bit surprising that Word would hide that data, or perhaps is simply unable to display it.
There's no reason to upload the entire .doc file. Please just find a page that displays the problem, save it as a separate file, make sure it still displays the problems you're seeing, then upload that as a sample.
Cheers!
---Fox
OOo 3.2.0 Portable, Windows 7 Home Premium 64-bit
Re: ASCII garbage in non-text areas
I could not select the problem areas from the .doc file directly. I've put the .doc document through another conversion and produced the attached file.
Attachment: blazeware sample1.odt (16.69 KiB)
Re: ASCII garbage in non-text areas
joelstern wrote: I could not select the problem areas from the .doc file directly.
Perhaps I should have said, "delete all the other pages." Would that have made a difference?
joelstern wrote: I've put the .doc document through another conversion and produced the attached file.
Thank you, but I'm a little confused. I thought you were working with .doc files in Writer. This one's an .odt Writer file, so it would be better to be able to see a .doc example that displays correctly in Word but not in Writer.
Also, you haven't yet answered hol.sten's question: What did you use to convert the PDF file? That could provide a clue or two as to what's going on with the file.
All I can tell you based on the attachment is that it appears the image data has been changed or removed so the program can't recognize or use the remaining data as an image. File signatures for JPG start out with FF D8 FF, but the next bytes should be either FE 00 or E1, according to my sources. The next bytes in the file you have are E0 00, so I'm not sure that's a legitimate JPG code, but I'm also not sure how Word could display those images correctly if it isn't. I'll have to dig around some more and see if I can find it online. Maybe it's a Microsoft JPG format.
EDIT: Well, it is a valid header for JPEGs in JFIF-compliant format.
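For anyone who wants to check the first bytes of an extracted image themselves, here is a minimal sketch in Python. The function name `describe_jpeg_header` is made up for illustration, and it only looks at the first marker after the FF D8 start-of-image bytes. It does not validate the rest of the file, so treat the result as a hint rather than proof that the image is intact.

```python
def describe_jpeg_header(data: bytes) -> str:
    """Describe the leading bytes of a suspected JPEG (hypothetical helper)."""
    # Every JPEG stream begins with the start-of-image marker FF D8.
    if len(data) < 4 or data[:2] != b"\xff\xd8":
        return "not a JPEG (missing FF D8 start-of-image marker)"
    # The next segment must begin with FF followed by a marker code.
    if data[2] != 0xFF:
        return "damaged JPEG (no marker after FF D8)"
    marker = data[3]
    if marker == 0xE0:
        # APP0 segment: two length bytes, then the identifier "JFIF\0".
        if data[6:11] == b"JFIF\x00":
            return "JPEG, JFIF format (APP0 marker FF E0)"
        return "JPEG with APP0 marker FF E0"
    if marker == 0xE1:
        return "JPEG with APP1 marker FF E1 (typically Exif)"
    if marker == 0xFE:
        return "JPEG starting with a comment segment (FF FE)"
    return f"JPEG with first marker FF {marker:02X}"


# Sample bytes typed in by hand, matching the FF D8 FF E0 00 pattern above.
sample = bytes.fromhex("FFD8FFE000104A46494600")
print(describe_jpeg_header(sample))
```

To test a real file, read its first few bytes with `open(path, "rb").read(16)` and pass them to the function, where `path` is wherever you saved the extracted image.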
Cheers!
---Fox
OOo 3.2.0 Portable, Windows 7 Home Premium 64-bit
Re: ASCII garbage in non-text areas
I started with a seven-page PDF. I emailed it to my son, who has a full version of Adobe. He converted it to a .doc file, opened it successfully with Office, then emailed it to me. I could not read it and sent it back to him. He loaded my attachment and again read it successfully.
I can't select any part of my copy of the .doc file with the programs I have, so I converted it to a .txt file, cut and pasted the first few paragraphs, then saved it with Writer so I could show you something.
Re: ASCII garbage in non-text areas
joelstern wrote: I started with a seven-page PDF. I emailed it to my son, who has a full version of Adobe. He converted it to a .doc file, opened it successfully with Office, then emailed it to me. I could not read it and sent it back to him. He loaded my attachment and again read it successfully.
I can't select any part of my copy of the .doc file with the programs I have, so I converted it to a .txt file, cut and pasted the first few paragraphs, then saved it with Writer so I could show you something.
Oh, I see. Thank you!
Plain-text (.txt) files can't hold images, so that's an extra conversion layer that could affect what we're seeing in the Writer file.
I'd be happy to work with you privately if you wish. I have Acrobat 7.0 and Word 2003, so maybe we can re-create the file conversion to .doc and see what happens... and hopefully find a way to get you a file you can work with. Please PM me if you're interested in pursuing that route.
Cheers!
---Fox
OOo 3.2.0 Portable, Windows 7 Home Premium 64-bit