
Batch convert PDF to Writer

Posted: Tue Dec 19, 2023 9:34 pm
by johnbmatz
How can I batch convert Adobe files to Open Office 4.1?

 Edit: Changed subject, was Batch converting Adobe files to Open Office 
Make your post understandable by others 
-- MrProgrammer, forum moderator 

Re: Batch converting Adobe files to Open Office

Posted: Tue Dec 19, 2023 10:02 pm
by RoryOF
What type of Adobe files?

Re: Batch converting Adobe files to Open Office

Posted: Tue Dec 19, 2023 10:26 pm
by johnbmatz
I think they are in Adobe Acrobat.

Re: Batch converting Adobe files to Open Office

Posted: Tue Dec 19, 2023 10:58 pm
by johnbmatz
Just ordinary PDF files

Re: Batch converting Adobe files to Open Office

Posted: Wed Dec 20, 2023 12:26 am
by Hagar Delest
AOO has limited capacity for that. I don't remember whether an extension is still needed. LibreOffice has this kind of extension built in and can import PDF documents, but page by page, IIRC. The result may look odd because the import does not recognize paragraphs, only blocks of text, especially if the pages contain objects such as pictures and captions.
It may be quicker to recreate the document by copy and paste.

Re: Batch converting Adobe files to Open Office

Posted: Wed Dec 20, 2023 10:27 am
by RoryOF
I normally put PDF files through an OCR application (usually gimagereader driving Tesseract, running on Xubuntu Linux) and then reformat the result completely, but in my case such PDFs are plain text without illustrations or tables.

There is at least one Windows application that will attempt to preserve the original format, but I've forgotten its name as I don't use Windows. An OCR application that produces hOCR output may give reasonable XML-coded output that preserves the layout. I have no experience with hOCR output from PDF.

Re: Batch converting Adobe files to Open Office

Posted: Wed Dec 20, 2023 11:49 am
by RoryOF
I found this online:
There is a means to convert PDF files to Word files. You can then save them as ODT, though if there is special formatting in Word, the conversion may not be exact.

Foxit has a PDF Editor, as well as a free Reader version. The Editor is often available for a free trial period after you download the free Reader. A line from their web page describes the process of PDF to Word conversion:

1. Open the PDF file with Foxit PDF Editor and go to Convert tab > To MS Office > Word, or File tab > Export > To MS Office > Word > Save As. The Save As window will pop up.
I do not have the Editor version, so I don't know whether it can also convert directly to ODT.

Re: Batch convert PDF to Writer

Posted: Tue Dec 26, 2023 5:38 pm
by MrProgrammer
johnbmatz wrote (Tue Dec 19, 2023 9:34 pm): "How can I batch convert Adobe files to Open Office 4.1"
You can't. OpenOffice does not provide that feature.

Portable Document Format (PDF) is intended as a final format, suitable only for viewing or printing; it is portable in the sense that it can be reliably copied to other systems for that purpose. Attempts to convert a PDF into some other document type (text, spreadsheet, presentation, etc.) are thwarted because the information necessary to do that is not present in the PDF.

If this solved your problem, please go to your first post, use the Edit button, and add [Solved] to the start of the Subject field. Select the green checkmark icon at the same time.

Re: Batch convert PDF to Writer

Posted: Thu Jan 25, 2024 7:20 pm
by jep
Not Writer, but Draw!
Look for "PDF Import Extension for Apache OpenOffice 0.1.1" to import drawings (even quite complicated files), text in PDF files, etc., and adjust them as needed.
Combined with macros (ooRexx) or used manually, it is very handy!
You can create a macro that opens PDF files and saves them as .odg.
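
A minimal sketch of such a batch job, written in Python rather than as an ooRexx macro, and not taken from anyone in this thread. It assumes a LibreOffice `soffice` binary on the PATH, whose command line accepts `--headless`, `--convert-to` and `--outdir`; whether the same options work with AOO 4.1 is not confirmed here. The function names and the folder names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, out_dir, soffice="soffice"):
    """Return the command line asking soffice to convert one PDF to .odg.

    Assumption: `soffice` is LibreOffice's launcher and is on the PATH.
    """
    return [
        soffice,
        "--headless",            # no GUI window
        "--convert-to", "odg",   # target format: Draw document
        "--outdir", str(out_dir),
        str(pdf_path),
    ]


def batch_convert(src_dir, out_dir, run=subprocess.run):
    """Issue one conversion per *.pdf in src_dir; return the commands used.

    `run` is injectable so the loop can be tried without soffice installed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        cmd = build_convert_command(pdf, out_dir)
        run(cmd, check=True)     # raises if the converter reports failure
        commands.append(cmd)
    return commands
```

Calling `batch_convert("pdf_in", "odg_out")` would process every PDF in the hypothetical folder `pdf_in`. The layout caveats raised elsewhere in this topic still apply to each converted file.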

Re: Batch convert PDF to Writer

Posted: Thu Jan 25, 2024 7:55 pm
by RoryOF
Be aware that the PDF Import Extension is suitable only for minor cosmetic changes to PDF files, and may also only handle smaller PDF files.

MrProgrammer has published a Perl script to extract the text from PDF files that have such text embedded in them (not necessarily _all_ PDF files).

viewtopic.php?p=410366#p410366
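
For comparison, here is a rough sketch of batch text extraction. It is not MrProgrammer's Perl script. It swaps in the `pdftotext` command-line tool from Poppler, which is assumed to be installed, and it only helps for PDFs that contain embedded text. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def text_output_path(pdf_path, out_dir):
    """Map e.g. docs/report.pdf to <out_dir>/report.txt."""
    return Path(out_dir) / (Path(pdf_path).stem + ".txt")


def extract_text(pdf_path, out_dir, run=subprocess.run):
    """Run pdftotext on one file and return the path of the text file.

    Assumption: Poppler's `pdftotext` is on the PATH. `-layout` asks it to
    keep the physical layout of the page as far as it can.
    """
    target = text_output_path(pdf_path, out_dir)
    run(["pdftotext", "-layout", str(pdf_path), str(target)], check=True)
    return target


def extract_all(src_dir, out_dir, run=subprocess.run):
    """Extract text from every *.pdf in src_dir into out_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    return [extract_text(p, out_dir, run=run)
            for p in sorted(Path(src_dir).glob("*.pdf"))]
```

The resulting .txt files can be opened in Writer, but paragraphs and formatting still have to be rebuilt by hand.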

Re: Batch convert PDF to Writer

Posted: Thu Jan 25, 2024 11:37 pm
by Lupp
See also:
"Consolidate Text" in
https://wiki.documentfoundation.org/Rel ... ess_&_Draw ,
the enhancement discussion under
https://bugs.documentfoundation.org/sho ... ?id=118370
and the much older suggestions posted to
https://ask.libreoffice.org/t/pdf-to-dr ... iter/13805

Among these suggestions was a workaround of my own, which I had sketched on a whim with no intention of using it myself. I never tried to build anything like a "batch conversion" on it. (I only mention this old post here because, judging from the upvotes, it seemingly had some users.)

Anyway, we should see clearly that none of this can accomplish impossible tasks. We cannot convert a PDF file into a Writer file because it lacks much of the information such a process would need. We also cannot convert a Writer file to PDF. We can only export the Writer document to a file capable of telling a printer what should be output on paper. That is essentially what PDF is made for. If you want truly editable PDF files, you need to use a PDF editor (like Acrobat) and accept the shortcomings of that approach.

In short: we can convert water to ice and back. We cannot convert iron to gold. And a printer doesn't "convert" a PDF file to printed paper. It just prints. And what you may get from the print using "OCR" isn't a converted file.

Also: Don't wait for AOO to implement a feature like "Consolidate Text".