[Dropped] Convert PDF to ODS

dbuster
Posts: 1
Joined: Fri Nov 22, 2024 7:33 pm

Post by dbuster »

I have received a PDF file that was generated from either a database or a spreadsheet. I would like to turn it into an OpenOffice spreadsheet (ODS). Is there a way to do this? Thanks.
Last edited by MrProgrammer on Tue Dec 10, 2024 6:37 pm, edited 1 time in total.
Reason: Dropped: Suggestions provided but no response from dbuster
dbuster, Open Office 4.1.14, Windows 11
Hagar Delest
Moderator
Posts: 33629
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: Convert PDF to ODS

Post by Hagar Delest »

Hi and welcome to the forum!

Try a simple copy and paste first: depending on how the PDF was made, the table structure may have been preserved.
Printing the PDF as an image and then running it through an OCR application may also give you some results.
Note: you can also try with LibreOffice (a portable version is enough); it may handle the copy and paste better.
But that is a lot of trouble, and redoing the whole file may be quicker, especially if copy-paste at least gives you the data series as rows or columns.
LibreOffice 25.2 on Linux Mint Debian Edition (LMDE 7 Gigi) and 25.2 portable on Windows 11.
Jan_J
Posts: 195
Joined: Wed Apr 29, 2009 1:42 pm
Location: Poland

Re: Convert PDF to ODS

Post by Jan_J »

Too many detailed problems arise to trust that a PDF conversion produces reliable data.
One of them is custom font encoding: there is no guarantee that the character codes in a PDF document correspond to the code points of a standard text encoding such as Unicode.
Another is that objects can be placed at arbitrary, translated positions, so the order of objects in the file need not match their order on the page.
However, in *simple* situations, extracting text from the PDF and/or Copy→Paste operations may go smoothly.
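
For the simple case, a minimal Python sketch of turning pasted or extracted text into a CSV file that Calc can open. It assumes that columns are separated by a tab or by two or more spaces and that no cell contains such a run itself; the function names split_columns and text_to_csv are made up for this illustration. Real PDFs may well break these assumptions.

```python
import csv
import io
import re


def split_columns(line):
    """Split one line into cells on a tab or a run of two or more whitespace characters."""
    return [cell.strip() for cell in re.split(r"\t|\s{2,}", line.strip())]


def text_to_csv(text):
    """Convert column-aligned text to CSV text, skipping blank lines."""
    out = io.StringIO()
    writer = csv.writer(out)
    for line in text.splitlines():
        if line.strip():
            writer.writerow(split_columns(line))
    return out.getvalue()


if __name__ == "__main__":
    # Assumed sample: cells may contain single spaces, columns are 2+ spaces apart.
    sample = "Item        Qty   Price\nRed apple   3     1.20\n"
    print(text_to_csv(sample), end="")
```

Save the result with a .csv extension, open it in Calc, check the columns by eye, and then use Save As to store it as ODS.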
JJ ∙ https://forum.openoffice.org/pl/
LO (26.2) ∙ Python (3.13|3.10) ∙ Unicode 17 ∙ LᴬTEX 2ε ∙ XML ∙ Unix tools ∙ Linux (Rocky|CentOS)
RoryOF
Moderator
Posts: 35210
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Convert PDF to ODS

Post by RoryOF »

There is a Linux utility, pdftotext, that may be of use, but be aware that it does not always (for reasons I do not know) produce strictly linear text output. With PDF book text, one often finds displaced chunks of text in the output file.
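
A minimal sketch of calling pdftotext from Python, assuming the utility (shipped with poppler-utils or Xpdf) is installed and on the PATH. The -layout option asks it to keep the physical page layout, which tends to keep table columns aligned, though it does not rule out displaced text. The function names and file names here are invented for the example.

```python
import shutil
import subprocess
import sys


def build_pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Return the argument list for a pdftotext call."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout preserves the original physical layout as well as possible.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def pdf_to_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext; raises if the tool is missing or exits with an error."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run(build_pdftotext_command(pdf_path, txt_path, keep_layout), check=True)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        pdf_to_text(sys.argv[1], sys.argv[2])
    else:
        print("usage: python pdf_to_text.py input.pdf output.txt")
```

The resulting text file still has to be checked against the PDF before the data is trusted.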
Apache OpenOffice 4.1.16 on Xubuntu 24.04.4 LTS