Convert PDF file to text document

Locked
Larry1700
Posts: 46
Joined: Wed Feb 26, 2014 10:30 pm

Convert PDF file to text document

Post by Larry1700 »

Ye Olde Luddite here. Is there a way to "convert" a PDF file into Open Office? Someone has sent me a long PDF document which he wants me to edit. Thank you.
Windows 8 - Open Office 4.0.1
RoryOF
Moderator
Posts: 35210
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: PDF file

Post by RoryOF »

Ask if they have an original file, be it .odt, .txt, .doc, .docx or some other non-PDF format. Using OpenOffice or LibreOffice with some PDF-opening application is not the best way to edit a PDF file. In any case, no matter how you convert the file to edit it in OpenOffice, you are likely to have a lot of reformatting to do. Getting an original text file will preserve much of the formatting.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
Larry1700
Posts: 46
Joined: Wed Feb 26, 2014 10:30 pm

Re: PDF file

Post by Larry1700 »

OK, thank you very much.
Windows 8 - Open Office 4.0.1
LastUnicorn
Posts: 812
Joined: Sat Mar 29, 2008 2:41 am
Location: Scotland

Re: PDF file

Post by LastUnicorn »

As far as I am aware this can't be done, as normal PDF files are non-editable. You would have to select all the text in the PDF in a PDF management application and paste the selected text into OpenOffice, though this assumes the PDF is not a scan of a hard-copy document (a book, for example). Having done that, you would then need to tidy up the pasted text in OpenOffice. This would be a manual process, but you might find that Find & Replace speeds it up considerably.
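
A minimal sketch of that tidy-up step, written in Python rather than done through the Find & Replace dialog. It assumes the text was pasted as plain text, with a hard line break at the end of each PDF line and blank lines between paragraphs. The function name tidy_pasted_text is hypothetical, and the hyphen rule is only a heuristic: it will wrongly join a genuine compound such as "well-known" if it happened to be split at a line end.

```python
import re


def tidy_pasted_text(raw: str) -> str:
    """Rejoin hard-wrapped lines in text copied out of a PDF (heuristic)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    # Assumption: a hyphen at a line end is a soft hyphen, not a real one.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside each paragraph, collapse line breaks and space runs to one space.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    # Drop empty paragraphs and separate the rest with a single blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "This is a long docu-\nment that was\ncopied from a PDF.\n\n\nSecond   paragraph\nhere."
    print(tidy_pasted_text(sample))
```

The result can then be pasted into Writer, where headings, lists and tables would still need to be restored by hand.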

Another way, not open to you I would think, would have applied if the PDF file had been created in LibreOffice: it could then have been exported as a Hybrid PDF file. A Hybrid PDF incorporates, as a buried element (if I can put it that way) in the PDF file, the text of a normal LibreOffice .odt file. If it is ever necessary to edit the PDF, it can be loaded into LibreOffice Writer and the editing done on the buried Writer text inside the PDF. Once the editing is done, just export the Hybrid PDF file anew and all changes to the Writer text will be included in the new Hybrid PDF. You would, though, be well advised to save the Writer file as a normal .odt file as well, so that you always have an alternative route to creating a new PDF.

Perhaps you could mention LibreOffice and Hybrid PDFs to the author of the PDF file you are having to deal with.

If you want to test this out yourself you might find this useful: There are several good reasons for making the switch anyway, some of which are mentioned here: [Tutorial] Considering a Switch from OpenOffice to LibreOffice? Some Useful Information
Last edited by LastUnicorn on Fri Feb 07, 2025 7:28 pm, edited 1 time in total.
LibreOffice 25.8.4.2 (x64) installed to Windows 11 Pro. 25H2
Apache OpenOffice Portable 4.1.16 [Portable Apps]
For Java I use Adoptium Temurin JRE LTS Releases.
RoryOF
Moderator
Posts: 35210
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: PDF file

Post by RoryOF »

As LastUnicorn says, there are other ways, but the simplest is to ask first if there is a text file. If not, we can go into those ways.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
Larry1700
Posts: 46
Joined: Wed Feb 26, 2014 10:30 pm

Re: PDF file

Post by Larry1700 »

Thank you to everyone who responded.
Windows 8 - Open Office 4.0.1
JeJe
Volunteer
Posts: 3132
Joined: Wed Mar 09, 2016 2:40 pm

Re: PDF file

Post by JeJe »

LibreOffice may open it - as a Draw file - but not OO.
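
For anyone who would rather try this from a command line than through File > Open, here is a rough sketch in Python. It only builds the command for a headless LibreOffice conversion and does not run it. The assumptions are that the soffice executable is on the PATH and that forcing the writer_pdf_import input filter makes LibreOffice load the PDF into Writer rather than Draw. The function name build_convert_command is hypothetical.

```python
def build_convert_command(pdf_path: str, out_dir: str = ".") -> list[str]:
    """Build (but do not run) a LibreOffice command converting a PDF to .odt."""
    return [
        "soffice",                       # assumed to be on the PATH
        "--headless",                    # run without opening a window
        "--infilter=writer_pdf_import",  # ask for the Writer PDF import filter
        "--convert-to", "odt",           # target format
        "--outdir", out_dir,             # folder that receives the .odt file
        pdf_path,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(" ".join(build_convert_command("document.pdf", "converted")))
```

Passing the returned list to subprocess.run(cmd, check=True) would carry out the conversion. The resulting .odt is still likely to need a good deal of reformatting.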
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
RoryOF
Moderator
Posts: 35210
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Convert PDF file to text document

Post by RoryOF »

Using
https://www.zamzar.com/converters/document/pdf/
one can convert PDF to ODT format.

There will almost certainly be some reformatting required.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
Larry1700
Posts: 46
Joined: Wed Feb 26, 2014 10:30 pm

Re: Convert PDF file to text document

Post by Larry1700 »

Thank you for this information.
Windows 8 - Open Office 4.0.1
Bidouille
Volunteer
Posts: 681
Joined: Mon Nov 19, 2007 10:58 am
Location: France

Re: PDF file

Post by Bidouille »

JeJe wrote: Fri Feb 07, 2025 8:33 pm LibreOffice may open it - as a Draw file - but not OO.
For AOO, we need to install an OXT
https://extensions.openoffice.org/en/pr ... openoffice
LibO simply embeds it
JeJe
Volunteer
Posts: 3132
Joined: Wed Mar 09, 2016 2:40 pm

Re: Convert PDF file to text document

Post by JeJe »

Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Bidouille
Volunteer
Posts: 681
Joined: Mon Nov 19, 2007 10:58 am
Location: France

Re: Convert PDF file to text document

Post by Bidouille »

Original code from Ariel Constenla-Haile

Render unto Caesar that which is Caesar's
Hagar Delest
Moderator
Posts: 33633
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: Convert PDF file to text document

Post by Hagar Delest »

The point with hybrid PDF is that it's not a very well-known feature. The author also has to think of that possibility, which is quite unlikely in most cases, especially for a request such as the OP's. If a PDF has been sent for editing, it is very likely that the source file is no longer available.
LibreOffice 25.2 on Linux Mint Debian Edition (LMDE 7 Gigi) and 25.2 portable on Windows 11.
Bidouille
Volunteer
Posts: 681
Joined: Mon Nov 19, 2007 10:58 am
Location: France

Re: Convert PDF file to text document

Post by Bidouille »

Hagar Delest wrote: Sun Feb 16, 2025 8:22 pm If a PDF has been sent for editing, it is very likely that the source file is no longer available.
PDF is originally an output format (viewing or printing).
Cross-posted with same issue: viewtopic.php?p=551917#p551917
Hagar Delest
Moderator
Posts: 33633
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: Convert PDF file to text document

Post by Hagar Delest »

So what?
The OP's issue is that someone sent him a PDF to be edited. If the sender had the source file, there would be no issue at all. Since PDF is indeed made for output, the question is how to reverse engineer the document, so to speak.
I would not call that cross-posting (unless you are talking about your own posts), since the topics are not from the same user. This question comes up quite frequently in the forum.
LibreOffice 25.2 on Linux Mint Debian Edition (LMDE 7 Gigi) and 25.2 portable on Windows 11.