[Solved] AOO API to convert Office formats to image

beakhole
Posts: 8
Joined: Thu Feb 07, 2019 10:43 pm

[Solved] AOO API to convert Office formats to image

Post by beakhole »

Hello,

I'm attempting to write a command line application that will allow me to convert most popular office formats to both PDF and JPG (or any image format). I found an awesome little project called PDFVert (https://github.com/dmazz55/Pdfvert) which uses the OO API to convert popular formats to PDF, and I have downloaded and tested that and it works wonderfully. This is the first time I will be using the API, but looking at the source code for PDFVert answers a lot of questions.

From what I can tell, I need to find the Open Office filter name for each target format, so that the application can tell OO which format to convert the file to. For example, when PDFVert converts a .doc file, it passes the filter name "writer_pdf_Export" to tell the Open Office API to use Writer and export as PDF.
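
To illustrate the filter-name idea, here is a minimal sketch that picks a PDF export filter name from a file's extension. It is in Python rather than C# (this board covers Python, and the filter names are the same from any language). Only "writer_pdf_Export" comes from PDFVert as described above; "calc_pdf_Export" and "impress_pdf_Export" appear in the filter list later in this thread. The extension table and the helper name `pdf_filter_for` are assumptions made up for the example.

```python
import os

# Extension -> application family. This table is an assumption for the
# example; it is not taken from PDFVert.
FAMILY_BY_EXTENSION = {
    ".doc": "writer", ".docx": "writer", ".odt": "writer", ".rtf": "writer",
    ".xls": "calc", ".xlsx": "calc", ".ods": "calc",
    ".ppt": "impress", ".pptx": "impress", ".odp": "impress",
}


def pdf_filter_for(path):
    """Return a PDF export filter name such as 'writer_pdf_Export'."""
    ext = os.path.splitext(path)[1].lower()
    family = FAMILY_BY_EXTENSION.get(ext)
    if family is None:
        raise ValueError("no known filter for extension %r" % ext)
    return family + "_pdf_Export"
```

The returned string would be the value of the "FilterName" property in the property array handed to storeToURL().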

Is there a filter I can use for resaving as an image? I understand image saving/exporting may not be included in OO by default, but there are some extensions that could help like this one (https://extensions.openoffice.org/en/pr ... s#releases). If I install an extension like that, is there any way for me to utilize it via the Open Office API in my app, or is it only accessible when using the Open Office applications normally, not through the API?

To increase the number of formats I can convert to an image file, I was also thinking I could use PDF as an intermediary format, since thanks to PDFVert I can already convert most formats to PDF. This extension (https://extensions.openoffice.org/en/pr ... openoffice) enables Open Office to import PDFs, so it might be possible to import a PDF with it and then use the first extension to save it as an image. Again, though, I would need some way to interface with these extensions via the Open Office API. Is that possible? Is there a filter name these extensions add?

I found some more filters at this project (https://gist.github.com/psd/268550/faca ... 517c3e2167), but they are for resaving as office formats, not images.

Any tips to point me in the right direction would be amazing.

Edit: I found a larger filter list here (https://stackoverflow.com/questions/117 ... le-formats), but I haven't tried any of the filters yet.
Last edited by Hagar Delest on Wed Feb 13, 2019 9:11 am, edited 3 times in total.
Reason: tagged solved
OpenOffice 4 on Windows 10
JeJe
Volunteer
Posts: 2764
Joined: Wed Mar 09, 2016 2:40 pm

Re: Using Open Office API to convert Office formats to image

Post by JeJe »

Have you seen this thread?

viewtopic.php?f=20&t=88245
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
beakhole
Posts: 8
Joined: Thu Feb 07, 2019 10:43 pm

Re: Using Open Office API to convert Office formats to image

Post by beakhole »

I did see that. However, when I skimmed it the first time, I saw someone suggesting a third-party converter like Zamzar, so I kept looking for other threads instead. I looked it over again more closely, and the included macro to print filters was very useful. I ran it in Writer and got this list (I haven't tried any of these yet, but will be doing so soon):

Code:

writer_pdf_Export
draw_PCD_Photo_CD_Base4
Rich Text Format
MS Excel 97
writer_web_HTML_help
XHTML Writer File
Text (StarWriter/Web)
MS PowerPoint 97 Vorlage
CGM - Computer Graphics Metafile
writer_web_StarOffice_XML_Writer_Web_Template
dBase
StarOffice XML (Calc)
impress_svg_Export
WMF - MS Windows Metafile
SVG - Scalable Vector Graphics
MS Word 2003 XML
MS Excel 95 Vorlage/Template
impress_html_Export
draw_bmp_Export
Text - txt - csv (StarCalc)
MS Word 95 Vorlage
Calc MS Excel 2007 XML
impress_emf_Export
DocBook File
draw_ppm_Export
MS Word 97 Vorlage
MS Word 2007 XML
UOF text
PCT - Mac Pict
MS PowerPoint 97
Calc MS Excel 2007 Binary
writerweb8_writer
XPM
StarOffice XML (Base)
Quattro Pro 6.0
PCX - Zsoft Paintbrush
HTML
draw_pdf_addstream_import
DIF
impress_gif_Export
impress8_template
calc_pdf_addstream_import
MS Word 2007 XML Template
MS WinWord 5
impress_tif_Export
MS Excel 97 Vorlage/Template
impress_flash_Export
draw_svg_Export
UOF spreadsheet
impress_jpg_Export
Impress MS PowerPoint 2007 XML
impress_svm_Export
draw_PCD_Photo_CD_Base
StarOffice XML (Impress)
Text (encoded) (StarWriter/Web)
impress_pgm_Export
impress_eps_Export
draw_emf_Export
UOF presentation
writerglobal8
math8
impress_pdf_Export
writerglobal8_HTML
chart8
writer_pdf_import
impress_pdf_addstream_import
RAS - Sun Rasterfile
GIF - Graphics Interchange
draw_gif_Export
draw8
MS Excel 95 (StarWriter)
draw_tif_Export
MS Excel 2003 XML
writer_web_pdf_Export
draw_jpg_Export
HTML (StarCalc)
impress_met_Export
draw_svm_Export
TIF - Tag Image File
draw_StarOffice_XML_Draw_Template
PSD - Adobe Photoshop
NSO Calc UOF2
MET - OS/2 Metafile
MS Excel 5.0/95
MS Excel 5.0/95 Vorlage/Template
draw_pgm_Export
draw_eps_Export
DXF - AutoCAD Interchange
writer_globaldocument_StarOffice_XML_Writer_GlobalDocument
EPS - Encapsulated PostScript
impress_wmf_Export
draw_pdf_Export
SYLK
SGF - StarOffice Writer SGF
impress_StarOffice_XML_Impress_Template
NSO Writer UOF2
draw_flash_Export
impress_xpm_Export
XHTML Draw File
SVM - StarView Metafile
writer_web_StarOffice_XML_Writer
impress_ras_Export
draw_html_Export
draw8_template
XHTML Calc File
T602Document
PBM - Portable Bitmap
BMP - MS Windows
Lotus 1-2-3 1.0 (WIN) (StarWriter)
MS Excel 4.0 Vorlage/Template
writer8_template
draw_PCD_Photo_CD_Base16
placeware_Export
MS Excel 5.0 (StarWriter)
HTML (StarWriter)
draw_met_Export
writerweb8_writer_template
impress_pct_Export
calc_pdf_Export
calc8
StarOffice XML (Chart)
MathML XML (Math)
impress8_draw
MathType 3.x
Calc MS Excel 2007 XML Template
impress_png_Export
XHTML Impress File
draw_wmf_Export
StarOffice XML (Math)
Lotus 1-2-3 1.0 (DOS) (StarWriter)
writer_globaldocument_StarOffice_XML_Writer
impress_pdf_import
impress_pbm_Export
StarOffice XML (Writer)
MS Word 95
writer8
Text
PNG - Portable Network Graphic
writerglobal8_writer
writer_StarOffice_XML_Writer_Template
draw_xpm_Export
calc_StarOffice_XML_Calc_Template
TGA - Truevision TARGA
MS Word 97
writer_pdf_addstream_import
draw_ras_Export
StarOffice XML (Draw)
SGV - StarDraw 2.0
impress_StarOffice_XML_Draw
calc_HTML_WebQuery
EMF - MS Windows Metafile
Text (encoded) (StarWriter/GlobalDocument)
Rich Text Format (StarCalc)
PGM - Portable Graymap
writer_globaldocument_pdf_Export
Text (encoded)
impress8
MS Excel 4.0
draw_pct_Export
MS WinWord 6.0
Lotus
draw_png_Export
XBM - X-Consortium
PPM - Portable Pixelmap
JPG - JPEG
NSO Impress UOF2
impress_bmp_Export
draw_pdf_import
draw_pbm_Export
calc8_template
MS Excel 4.0 (StarWriter)
impress_ppm_Export
MS Excel 95
math_pdf_Export
Impress MS PowerPoint 2007 XML Template
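
A list this long is easier to scan with a little code. Here is a rough Python sketch that pulls out the image export filters and groups them by application. The regular expression assumes the `<application>_<format>_Export` naming pattern visible in the list; the helper name `image_export_filters` and the choice of image formats are just for illustration.

```python
import re

# Matches names like "impress_jpg_Export" or "draw_png_Export".
EXPORT_NAME = re.compile(r"^(writer|calc|impress|draw|math)_([a-z]+)_Export$")

# Formats treated as "images" here; an arbitrary selection for the example.
IMAGE_FORMATS = {"jpg", "png", "gif", "bmp", "tif"}


def image_export_filters(filter_names):
    """Group image export filter names by application family."""
    found = {}
    for name in filter_names:
        match = EXPORT_NAME.match(name)
        if match and match.group(2) in IMAGE_FORMATS:
            found.setdefault(match.group(1), []).append(name)
    return found


sample = [
    "writer_pdf_Export", "impress_jpg_Export", "draw_png_Export",
    "draw_jpg_Export", "MS Word 97", "impress_png_Export",
]
groups = image_export_filters(sample)
```

In the list above, the only names matching this pattern for image formats start with "impress_" or "draw_"; there is no writer_ or calc_ equivalent.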
OpenOffice 4 on Windows 10
beakhole
Posts: 8
Joined: Thu Feb 07, 2019 10:43 pm

Re: Using Open Office API to convert Office formats to image

Post by beakhole »

Using the filter "impress_jpg_Export" I was able to save a PPT out as a JPG! It only saved the first slide, so for presentations I will need a way to get the number of slides and save each one separately. The output was also quite low quality, so I'll need to figure out which arguments the filter accepts to configure that. Still, it works!

Thanks for pointing me towards that thread.
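
On the quality question, a tentative Python sketch: as far as I know, the graphic export filters read extra options from a "FilterData" property, with "Quality" (1 to 100) for JPEG plus "PixelWidth" and "PixelHeight" for the output size. Treat those option names as assumptions to check against the Developer's Guide; I have not confirmed them with "impress_jpg_Export" on AOO 4. The helper name `jpg_filter_data` is made up.

```python
def jpg_filter_data(quality=90, pixel_width=None, pixel_height=None):
    """Build (name, value) pairs for a JPEG export's FilterData.

    The option names are assumptions (see the note above), not verified
    against a running office.
    """
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    data = [("Quality", quality)]
    # Only request a size when both dimensions are given.
    if pixel_width is not None and pixel_height is not None:
        data.append(("PixelWidth", pixel_width))
        data.append(("PixelHeight", pixel_height))
    return data
```

Each pair would become a PropertyValue, and the resulting array would be passed as the value of a "FilterData" property next to "FilterName" in the storeToURL() arguments.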
OpenOffice 4 on Windows 10
beakhole
Posts: 8
Joined: Thu Feb 07, 2019 10:43 pm

Re: [Solved] AOO API to convert Office formats to image

Post by beakhole »

For anyone else new to all of this who runs into the same problem, here is how I saved individual slides out as individual JPGs. The code is C#, and there are probably better ways to do this, but it worked for me.

I was unable to find an acceptable solution online. I did find a workaround here (https://bz.apache.org/ooo/show_bug.cgi?id=85875), but it works by saving out the first slide, closing and reopening Office, deleting the first slide from the input file, and then saving the new first slide (which was originally the second slide). This repeats until all slides are saved as JPGs. It does work, but it adds at least 1 full second PER PAGE, so it was not going to work for me.

I was able to figure out a slightly better way that is still relatively quick. The first thing I tried was adding a new property called "PageRange" to the properties passed to storeToURL, something like this:

Code:

new PropertyValue { Name = "PageRange", Value = new Any(((XDrawPagesSupplier)xComponent).getDrawPages().getCount()) }
Unfortunately, that didn't have any effect no matter what values I tried. I may have done something wrong, but I couldn't get it to change the output one way or the other.
So instead I ended up doing something similar to the link above, but without relaunching OO for every page: loop through the pages, save the first one, remove it from the loaded document, and save again. To get the pages, I used:

Code:

XDrawPages pages = ((XDrawPagesSupplier)xComponent).getDrawPages();
Then you can get the count for your loop:

Code:

pages.getCount()
Then save out the first page using storeToURL():

Code:

((XStorable)xComponent).storeToURL(destFile, propertyValues);
Finally, remove the first page and continue the loop:

Code:

pages.remove((XDrawPage)pages.getByIndex(0).Value);
This does not affect the input file (in case that matters to you), because the pages are only removed from the document loaded in memory. The loop quickly saves every page as a new JPG, and since you are in a loop you can add the page number to each file name however you like, so the files come out as "page_1.jpg", "page_2.jpg", and so on.
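
Put together, the steps above look roughly like this. It is a Python sketch of the same loop rather than my actual C# code; the UNO method names (getDrawPages, getCount, getByIndex, remove, storeToURL) are the ones used above, while the function name `export_slides` and its parameters are made up for the example. It skips the removal after the last page, as a precaution, since a presentation needs at least one page.

```python
def export_slides(doc, url_for_page, store_props):
    """Store each slide of a loaded Impress/Draw document to its own file.

    doc          -- the loaded document (XDrawPagesSupplier and XStorable)
    url_for_page -- callable mapping a 1-based page number to a file URL
    store_props  -- property array for storeToURL, including FilterName
    """
    pages = doc.getDrawPages()
    count = pages.getCount()
    urls = []
    for number in range(1, count + 1):
        url = url_for_page(number)
        # Per the posts above, the export only writes the first page.
        doc.storeToURL(url, store_props)
        urls.append(url)
        if number < count:
            # Drop the page just exported so the next one becomes first.
            pages.remove(pages.getByIndex(0))
    return urls
```

A call could look like `export_slides(doc, lambda n: "file:///C:/out/page_%d.jpg" % n, props)`, where `doc` and `props` come from your existing loading code and the output path is only a placeholder.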

There do not appear to be any Writer filters that export to JPG, though, so any tips on that would be welcome.
OpenOffice 4 on Windows 10