OCR Conv. of a .pdf in a Word doc

Posted: Wed Jan 02, 2008 10:55 pm
by Rosina9
Hello community,

Where can I find freeware or shareware, running on Vista, that can convert a German-language .pdf file into a Word .doc file? The other way round, converting from a Word .doc to PDF, was already covered in another thread.

Most grateful for your input. Best regards, Rosina

Re: OCR Conv. of a .pdf in a Word doc

Posted: Thu Jan 03, 2008 12:37 am
by foxcole
Rosina9 wrote: Where can I find freeware or shareware, running on Vista, that can convert a German-language .pdf file into a Word .doc file? The other way round, converting from a Word .doc to PDF, was already covered in another thread.
I'm afraid that not even Adobe Acrobat can do this very well. PDF is meant to be an end result, a "printed" version of a content file.

I think you'd get the best results from printing the PDF and scanning it in using OCR. Anything else just introduces a lot of formatting problems, because the PostScript code must be filtered and broken into a format that isn't PostScript. It's not an easy task, especially with freeware or shareware tools.

That said, you might want to try Zamzar for starters. I believe it supports a variety of languages.

Re: OCR Conv. of a .pdf in a Word doc

Posted: Thu Jan 03, 2008 1:05 pm
by esperantisto
I have never heard of any freeware or shareware tool for this, only commercial ones such as SolidConverterPDF, ScanSoft PDF Converter, and ABBYY PDF Transformer (the latter being, in fact, a stripped-down version of the ABBYY FineReader OCR program). None of them is perfect, for the reason already cited: PDF was designed to be an “end” format.

Re: OCR Conv. of a .pdf in a Word doc

Posted: Thu Jan 03, 2008 1:25 pm
by ccornell
KOffice can import PDF documents right now, and does a fairly good job of it on the text portions. Sadly, you're stuck using Vista, and KOffice is Linux-only right now (although there is a CygWin port being worked on). You could always upgrade to Linux and get rid of Vista :-)

There is a plan to have PDF import in OpenOffice.org as an extension: http://wiki.services.openoffice.org/wik ... _Extension