OCR Conv. of a .pdf in a Word doc

Rosina9
Posts: 1
Joined: Wed Jan 02, 2008 10:45 pm

OCR Conv. of a .pdf in a Word doc

Post by Rosina9 »

Hello community,

where can I find freeware or shareware that runs on Vista and can convert a German-language .pdf file into a Word .doc file? The other way round (Word .doc to PDF) was already covered in another thread.

Most grateful for your input. Best regards, Rosina
foxcole
Volunteer
Posts: 1507
Joined: Mon Oct 08, 2007 1:31 am
Location: Minneapolis, Minnesota

Re: OCR Conv. of a .pdf in a Word doc

Post by foxcole »

Rosina9 wrote: where can I find freeware or shareware that runs on Vista and can convert a German-language .pdf file into a Word .doc file? The other way round (Word .doc to PDF) was already covered in another thread.
I'm afraid that not even Adobe Acrobat can do this very well. PDF is meant to be an end result, a "printed" version of a content file.

I think you'd get the best results from printing the PDF and scanning it back in using OCR. Anything else just introduces a lot of formatting problems, because the PostScript code must be filtered and broken down into a format that isn't PostScript. It's not an easy task, especially with freeware or shareware tools.

That said, you might want to try Zamzar for starters. I believe it supports a variety of languages.
Cheers!
---Fox

OOo 3.2.0 Portable, Windows 7 Home Premium 64-bit
esperantisto
Volunteer
Posts: 578
Joined: Mon Oct 08, 2007 1:31 am

Re: OCR Conv. of a .pdf in a Word doc

Post by esperantisto »

I have never heard of any freeware or shareware tool for this, only commercial ones such as SolidConverterPDF, ScanSoft PDF Converter, and ABBYY PDF Transformer (the latter being, in fact, a stripped-down version of the ABBYY FineReader OCR program). None of them is perfect, for the reason already cited: PDF was designed to be an “end” format.
AOO 4.2.0 (of 2015) / LO 7.x / Win 7 / openSUSE Linux Leap 15.4 (64-bit)
ccornell
Volunteer
Posts: 611
Joined: Sun Oct 07, 2007 7:21 am

Re: OCR Conv. of a .pdf in a Word doc

Post by ccornell »

KOffice can import PDF documents right now, and does a fairly good job of it on the text portions. Sadly, you're stuck using Vista, and KOffice is Linux-only right now (although a Cygwin port is being worked on). You could always upgrade to Linux and get rid of Vista :-)

There is a plan to have PDF import in OpenOffice.org as an extension: http://wiki.services.openoffice.org/wik ... _Extension
openSUSE 11.4, KDE4.6 with OpenOffice.org 3.3