[Solved] PDF to text conversion

Postby RosaliyLynne » Tue Feb 26, 2019 10:48 pm

Does OpenOffice have a way to convert a PDF file back into either plain text or an OpenOffice format?
Last edited by Hagar Delest on Thu Feb 28, 2019 9:15 am, edited 1 time in total.
Reason: tagged solved
open office 3.3 / windows vista, windows xp and windows 7
RosaliyLynne
 
Posts: 2
Joined: Tue Jul 23, 2013 8:15 pm

Re: pdf to text conversion

Postby RoryOF » Tue Feb 26, 2019 10:55 pm

No. Frequently one can select text in a PDF and copy and paste it into an .odt file. If it is a very large file, it is often better to pass the PDF through an OCR (Optical Character Recognition) application, which will translate the PDF into editable text. Some, or much, reformatting and correction may be required.
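
For a large PDF that still has a selectable text layer, the copy and paste step can be scripted. The following is only a minimal sketch, assuming Poppler's pdftotext command-line tool is installed and on the PATH (it is not mentioned anywhere in this thread); the function names build_pdftotext_command and pdf_to_text are made up for illustration. It will not help with scanned PDFs where the text is a picture; those still need OCR.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Return the argument list for Poppler's pdftotext tool (assumed installed)."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout as far as it can
        command.append("-layout")
    command += [str(pdf_path), str(txt_path)]
    return command


def pdf_to_text(pdf_path, txt_path=None):
    """Extract the text layer of a PDF into a plain .txt file next to it."""
    pdf_path = Path(pdf_path)
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    # check=True raises CalledProcessError if pdftotext exits with an error
    subprocess.run(build_pdftotext_command(pdf_path, txt_path), check=True)
    return txt_path
```

Calling pdf_to_text("resume.pdf") would write resume.txt beside the PDF, which can then be opened in Writer and reformatted by hand.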
Apache OpenOffice 4.1.6 on Xubuntu 18.04.2 (mostly 64 bit version) and very infrequently on Win2K/XP
RoryOF
Moderator
 
Posts: 28563
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: pdf to text conversion

Postby John_Ha » Wed Feb 27, 2019 12:26 am

See [Tutorial] How do I view or edit a PDF file with OpenOffice?

You will find much useful information in the User Guides, the Writer, Base and Calc Tutorials and the AOO Frequently Asked Questions. May I suggest you bookmark the pages.
AOO 4.1.6, Windows 7 Home 64 bit

See the Writer Manual, the Writer FAQ, the Writer Tutorials and the Writer guide.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
John_Ha
Volunteer
 
Posts: 6572
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: pdf to text conversion

Postby Zizi64 » Wed Feb 27, 2019 9:05 am

LibreOffice can save into "hybrid PDF format". The resulting file contains both the ODF and the PDF version of the document at the same time. You can view and print it with PDF reader software, and you can still edit it with LibreOffice.

I know this is not a solution for an existing PDF file that contains the text as a picture or as text labels. Note: the original PDF format was not developed for re-editing.
Tibor Kovacs, Hungary; LO4.4.7, LO6.1.5 on Win7-10 x64Prof.
PortableApps, winPenPack: LO3.3.0-6.2.2; AOO4.1.5
Please, edit the initial post in the topic: add the word [Solved] at the beginning of the subject line - if your problem has been solved.
Zizi64
Volunteer
 
Posts: 7825
Joined: Wed May 26, 2010 7:55 am
Location: Budapest, Hungary

Re: pdf to text conversion

Postby John_Ha » Wed Feb 27, 2019 11:49 am

Zizi64 wrote: LibreOffice can save into "hybrid PDF format".

So can AOO.

Zizi64 wrote: Note: the original PDF format was not developed for re-editing.

PDF stands for Portable Document Format and was designed by Adobe for ease of reading on any system.

See the tutorial. PDF files can easily be fully edited with Adobe Acrobat. I think that the PDF format must be protected by patents or similar because, while many applications can write PDF files, very few can edit them.

 Edit: This may soon change - see https://en.wikipedia.org/wiki/PDF which says:

Adobe Systems made the PDF specification available free of charge in 1993. In the early years PDF was popular mainly in desktop publishing workflows, and competed with a variety of formats such as DjVu, Envoy, Common Ground Digital Paper, Farallon Replica and even Adobe's own PostScript format.

PDF was a proprietary format controlled by Adobe until it was released as an open standard on July 1, 2008, and published by the International Organization for Standardization as ISO 32000-1:2008, at which time control of the specification passed to an ISO Committee of volunteer industry experts. In 2008, Adobe published a Public Patent License to ISO 32000-1 granting royalty-free rights for all patents owned by Adobe that are necessary to make, use, sell, and distribute PDF compliant implementations.

PDF 1.7, the sixth edition of the PDF specification that became ISO 32000-1, includes some proprietary technologies defined only by Adobe, such as Adobe XML Forms Architecture (XFA) and JavaScript extension for Acrobat, which are referenced by ISO 32000-1 as normative and indispensable for the full implementation of the ISO 32000-1 specification. These proprietary technologies are not standardized and their specification is published only on Adobe’s website. Many of them are also not supported by popular third-party implementations of PDF.

On July 28, 2017, ISO 32000-2:2017 (PDF 2.0) was published. ISO 32000-2 does not include any proprietary technologies as normative references.
 
John_Ha
Volunteer
 
Posts: 6572
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: pdf to text conversion

Postby RosaliyLynne » Thu Feb 28, 2019 2:30 am

Thank you all for your responses. Adobe actually has such a program, but it requires a subscription, and it was taking so long to install that I contacted Adobe and cancelled that attempt. I ended up manually retyping the 2-page document (a resume for a friend) and saving it in Office format before converting it to PDF. This way it can be edited again in future. The original non-PDF file was on her work computer, but her position was eliminated and she had not backed it up to a flash drive. All is well though. Again, thank you all.
open office 3.3 / windows vista, windows xp and windows 7
RosaliyLynne
 
Posts: 2
Joined: Tue Jul 23, 2013 8:15 pm

