[Dropped] Convert PDF to ODS

Post by **dbuster** » Fri Nov 22, 2024 7:37 pm

I have received a PDF file that was derived from either a database or spreadsheet file. I would like to turn it into an Open Office spreadsheet. Is there a way to do this? Thanks.

Post by **Hagar Delest** » Fri Nov 22, 2024 10:37 pm

Hi and welcome to the forum!

Try a simple copy and paste: depending on how the PDF was made, the table structure may have been kept.
Printing the PDF as an image and then running it through an OCR application may also give you some results.
Note: you can also try LibreOffice (a portable version is enough); it may handle the copy and paste better.
But that's a lot of trouble, and redoing the whole file may be quicker, especially if copy-paste at least gives you each series as a row or column.
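
If the paste comes out as one long series (one value per line), a few lines of Python can regroup it into rows. This is only a sketch: the sample values, the column count of 3, and the helper name `reshape` are made up for illustration, and it assumes every row has the same number of cells.

```python
def reshape(values, columns):
    """Group a flat list of values into rows of `columns` cells each."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [values[i:i + columns] for i in range(0, len(values), columns)]


# Hypothetical pasted text: one cell per line, three cells per table row.
pasted = "Name\nQty\nPrice\nWidget A\n3\n4.50\nWidget B\n12\n10.00"

# Drop empty lines and surrounding whitespace before regrouping.
cells = [line.strip() for line in pasted.splitlines() if line.strip()]
table = reshape(cells, 3)

# Print tab-separated rows, which can be pasted into a spreadsheet.
for row in table:
    print("\t".join(row))
```

The tab-separated output should paste into Calc as separate columns, but check the result against the PDF, since a single missing cell shifts every row after it.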

Post by **Jan_J** » Mon Nov 25, 2024 10:52 pm

Too many detailed problems arise to trust that converting a PDF produces reliable data.
One of them is custom font encoding. There is no guarantee that the characters in a PDF document correspond to the usual code points of a standard text encoding such as Unicode.
Another is that objects can be given translated locations, so their position on the page need not follow their order in the file.
However, in *simple* situations, extracting text from the PDF and/or Copy→Paste operations may go smoothly.

Post by **RoryOF** » Mon Nov 25, 2024 10:59 pm

There is a Linux utility, pdftotext, that may be of use, but be aware that it does not always (for reasons I do not know) produce strictly linear text output. With PDF book text, one often finds displaced chunks of text in the output file.
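
For a table, running `pdftotext -layout input.pdf output.txt` (the file names are placeholders) tries to keep the columns aligned with spaces. The sketch below shows one way to turn such text into CSV that Calc can open and save as ODS. It assumes columns are separated by at least two spaces and that no cell contains a double space, which will not hold for every PDF; the sample text and function names are invented for illustration.

```python
import csv
import io
import re


def layout_text_to_rows(text):
    """Split layout-preserved text into rows of cells.

    Assumption: columns are separated by runs of two or more spaces.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between table rows
        rows.append(re.split(r"\s{2,}", line))
    return rows


def rows_to_csv(rows):
    """Return the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the contents of output.txt.
sample = (
    "Name        Qty    Price\n"
    "Widget A      3     4.50\n"
    "\n"
    "Widget B     12    10.00\n"
)
rows = layout_text_to_rows(sample)
print(rows_to_csv(rows), end="")
```

Empty cells are the weak point of this approach: a blank column collapses into the neighbouring gap, so rows with fewer cells than the header need checking by hand.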
