Convert PDF file to text document
Ye Olde Luddite here. Is there a way to "convert" a PDF file into Open Office? Someone has sent me a long PDF document which he wants me to edit. Thank you.
Windows 8 - Open Office 4.0.1
Re: PDF file
Ask whether they have the original file, be it .odt, .txt, .doc, .docx or some other non-PDF format. Using OpenOffice or LibreOffice with a PDF import tool is not the best way to edit a PDF file. However you convert the file for editing in OpenOffice, you are likely to have a lot of reformatting to do. Getting the original source file will preserve much of the formatting.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
- LastUnicorn
- Posts: 812
- Joined: Sat Mar 29, 2008 2:41 am
- Location: Scotland
Re: PDF file
As far as I am aware this can't be done directly, as normal PDF files are non-editable. You would have to select all the text in the PDF in a PDF viewer and paste it into OpenOffice, though this assumes the PDF is not a scan of a hard-copy document (a book, for example), since a scan contains only images of the pages. Having done that, you would then need to tidy up the pasted text in OpenOffice. This would be a manual process, but you might find that Find & Replace speeds it up considerably.
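If the pasted text is long, part of that tidying can be scripted before the text goes into Writer. What follows is only a rough sketch in Python, not something from this thread: the function name `tidy_pdf_text` is made up for illustration, and it assumes the copied text has hard line breaks inside paragraphs and blank lines between paragraphs, which is common but not guaranteed.

```python
import re


def tidy_pdf_text(raw: str) -> str:
    """Rejoin hard-wrapped lines in text copied out of a PDF (hypothetical helper)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at a line end: "docu-\nment" -> "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside each paragraph, collapse line breaks and runs of spaces to one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    # Drop empty paragraphs and separate the rest with a single blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "This is a long docu-\nment that was\nwrapped.\n\nSecond   paragraph\nhere."
    print(tidy_pdf_text(sample))
```

Note that the hyphen rule will also join genuine compounds that happen to break at a line end (such as "well-" followed by "known"), so the result still needs a read-through.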
Another way, not open to you I would think, applies only if the PDF file was created in LibreOffice, because it could then have been exported as a Hybrid PDF file. A Hybrid PDF incorporates, as a buried element (if I can put it that way), the content of a normal LibreOffice .odt file inside the PDF. If the PDF ever needs editing, it can be loaded into LibreOffice Writer and the editing done on the buried Writer text (inside the PDF). Once the editing is done, just export the Hybrid PDF file anew and all changes to the Writer text will be included in the new Hybrid PDF. You would still be well advised to save the Writer file as a normal .odt file as well, so you always have an alternative route to creating a new PDF.
Perhaps you could mention LibreOffice and Hybrid PDFs to the author of the PDF file you are having to deal with.
If you want to test this out yourself you might find this useful: There are several good reasons for making the switch anyway, some of which are mentioned here: [Tutorial] Considering a Switch from OpenOffice to LibreOffice? Some Useful Information
Last edited by LastUnicorn on Fri Feb 07, 2025 7:28 pm, edited 1 time in total.
LibreOffice 25.8.4.2 (x64) installed to Windows 11 Pro. 25H2
Apache OpenOffice Portable 4.1.16 [Portable Apps]
For Java I use Adoptium Temurin JRE LTS Releases.
Re: PDF file
As LastUnicorn says, there are other ways, but the simplest is to ask first if there is a text file. If not, we can go into those ways.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
Re: PDF file
LibreOffice may open it - as a Draw file - but not OO.
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Re: Convert PDF file to text document
Using
https://www.zamzar.com/converters/document/pdf/
one can convert PDF to ODT format.
There will almost certainly be some reformatting required.
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS
Re: PDF file
For AOO, we need to install an extension (OXT) to import PDF files:
https://extensions.openoffice.org/en/pr ... openoffice
LibO simply has it built in.
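Since LibreOffice has the PDF import built in, the conversion can also be tried from the command line instead of through the menus. The Python sketch below only wraps LibreOffice's `soffice` program with its `--headless`, `--infilter`, `--convert-to` and `--outdir` options. The function names are made up for illustration, and the filter name `writer_pdf_import` is an assumption that should be checked against the installed version. Expect the same reformatting work as with any other conversion route.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, out_dir, soffice="soffice"):
    """Build the soffice argument list for a PDF -> ODT conversion (hypothetical helper)."""
    return [
        soffice,
        "--headless",                    # run without opening a window
        "--infilter=writer_pdf_import",  # assumption: import into Writer, not Draw
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(pdf_path),
    ]


def convert_pdf_to_odt(pdf_path, out_dir):
    """Run the conversion; requires LibreOffice's soffice on the PATH."""
    exe = shutil.which("soffice")
    if exe is None:
        raise FileNotFoundError("soffice was not found on the PATH")
    subprocess.run(build_convert_command(pdf_path, out_dir, exe), check=True)
    # soffice names the result after the input file, with the new extension.
    return Path(out_dir) / (Path(pdf_path).stem + ".odt")


if __name__ == "__main__":
    # Show the command that would be run, without running it.
    print(" ".join(build_convert_command("document.pdf", "converted")))
```

On Windows, `soffice` is usually not on the PATH, so the full path to `soffice.exe` in the LibreOffice program folder would have to be supplied instead.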
Co-admin french forum branch
Re: Convert PDF file to text document
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Re: Convert PDF file to text document
Original code from Ariel Constenla-Haile
Render unto Caesar that which is Caesar's
Co-admin french forum branch
- Hagar Delest
- Moderator
- Posts: 33633
- Joined: Sun Oct 07, 2007 9:07 pm
- Location: France
Re: Convert PDF file to text document
The point with hybrid PDF is that it is not a well-known feature, and the author has to think of that possibility, which is quite unlikely in most cases, especially for a request such as the OP's. If a PDF has been sent for editing, it is very likely that the source file is not available anymore.
LibreOffice 25.2 on Linux Mint Debian Edition (LMDE 7 Gigi) and 25.2 portable on Windows 11.
Re: Convert PDF file to text document
Hagar Delest wrote: ↑Sun Feb 16, 2025 8:22 pm
If a PDF has been sent for editing, it is very likely the source file is not available anymore.
PDF is originally an output format (for viewing or printing).
Cross-posted with same issue: viewtopic.php?p=551917#p551917
Co-admin french forum branch
- Hagar Delest
- Moderator
- Posts: 33633
- Joined: Sun Oct 07, 2007 9:07 pm
- Location: France
Re: Convert PDF file to text document
So what?
The OP's issue is that someone sent him a PDF to be edited. If the sender had the source file, there would be no issue at all. Since PDF is indeed made for output, the question is how to reverse engineer the document, so to speak.
I would not call that cross-posting (except if you are talking about your posts) since the topics are not from the same user. This question is quite frequent in the forum.
LibreOffice 25.2 on Linux Mint Debian Edition (LMDE 7 Gigi) and 25.2 portable on Windows 11.