[Dropped] Convert PDF to ODS

Posted: Fri Nov 22, 2024 7:37 pm
by dbuster
I have received a PDF file that was derived from either a database or a spreadsheet file. I would like to turn it into an OpenOffice spreadsheet. Is there a way to do this? Thanks.

Re: Convert PDF to ODS

Posted: Fri Nov 22, 2024 10:37 pm
by Hagar Delest
Hi and welcome to the forum!

Try a simple copy and paste: depending on how the PDF was made, the table structure may have been preserved.
Printing the PDF as an image and then running it through an OCR application may also give you some results.
Note: you can also try LibreOffice (a portable version would do); it may handle the copy and paste better.
But that is a lot of trouble, and redoing the whole file may be quicker, especially if copy and paste at least gives you the data series as rows or columns.

Re: Convert PDF to ODS

Posted: Mon Nov 25, 2024 10:52 pm
by Jan_J
Too many detailed problems arise for me to believe that converting a PDF produces reliable data.
One of them is custom font encoding: there is no guarantee that the characters in a PDF document correspond to the code points of a standard text encoding such as Unicode.
Another is that objects can be placed at translated locations, so their order in the file need not match their position on the page.
However, in *simple* situations, extracting text from the PDF and/or Copy→Paste may go smoothly.

Re: Convert PDF to ODS

Posted: Mon Nov 25, 2024 10:59 pm
by RoryOF
There is a Linux utility, pdftotext, that may be of use, but be aware that it does not always (for reasons I do not know) produce strictly linear text output. With PDF book text, one can often find displaced chunks of text in the output file.
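
For a simple table, the pdftotext route could be sketched roughly as follows. This is only an illustration, not a tested recipe: it assumes pdftotext is installed and on the PATH, that its -layout option keeps the columns visually aligned, and that cells are separated by at least two spaces. The function names (pdf_to_layout_text, layout_text_to_rows, convert) are made up for this example. The result is a CSV file, which Calc can open and then save as ODS.

```python
import csv
import re
import subprocess


def pdf_to_layout_text(pdf_path):
    """Run pdftotext in layout mode and return its output as a string."""
    # "-layout" asks pdftotext to keep the physical layout of the page;
    # "-" as the output file name sends the text to standard output.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def layout_text_to_rows(text):
    """Split each non-empty line into cells on runs of 2+ whitespace characters.

    Assumption: a single space occurs only inside a cell, never between cells.
    """
    rows = []
    # splitlines() also breaks on the form feed that separates pages.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        rows.append(re.split(r"\s{2,}", line))
    return rows


def convert(pdf_path, csv_path):
    """Hypothetical helper: write the extracted rows to a CSV file."""
    rows = layout_text_to_rows(pdf_to_layout_text(pdf_path))
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        csv.writer(out).writerows(rows)
```

Calling convert("input.pdf", "output.csv") would then produce a file to check by hand. Empty cells, cells that themselves contain double spaces, and the font encoding problems mentioned above will all break this, so the output needs to be compared against the PDF.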