Using COM to integrate Powerpoint and Impress

Shared Libraries
Forum rules
For sharing working examples of macros / scripts. These can be in any script language supported by OpenOffice.org [Basic, Python, Netbean] or as source code files in Java or C# even - but requires the actual source code listing. This section is not for asking questions about writing your own macros.
Post Reply
TerryE
Volunteer
Posts: 1401
Joined: Sat Oct 06, 2007 10:13 pm
Location: UK

Using COM to integrate Powerpoint and Impress

Post by TerryE »

The trigger for this code example is the post Broken Hyperlinks where one of our users had a problem with migrating PowerPoint documents to Impress. The specific issue is a known one () and is that both the ODF add-n for MSOffice and the PowerPoint Importer have problems parsing action Hyperlinks associated with shapes on a slide; specifically when the Hyperlink is a local file reference rather than a Web URL. Normal embedded text Hyperlinks work fine; this only occurs wth the ones associated with clicking a button or a text box, for example.

My first step in this was to do a "binary chop" on the problem to determine the exact failure and to find a work around. This workaround was to go through all slides in PowerPoint and if any shape had such a hyperlink (for example ..\examples\Presentation3.txt) to manually convert this to a dummy URL format (e.g. in this case http://dummytodenotefile/../examples/Presentation3.txt) which the PPT importer would then recognise. Such a modified PPT could then be correctly imported into Impress, so that you could then do the equivalent enumeration to find all links of the form http://dummytodenotefile/... and convert them back.

As a proof of principle, this process worked fine and did the conversion. However, it is clearly a manually intensive and error prone process which would not be usable in practice and so needed automation. So how? Since I know both VBA and OOoBasic together with the two APIs for working with presentations, I could write macros in both PowerPoint and OOo to do the automation, but is there a neater way? The ideal approach would be to have a single integrated harness which could traverse a folder hierarchy converting all PPTs within it in a single automated step. So where to put this driving routine? My solution relied on the fact that whilst OOoBasic employs the UNO API, embedded with OOo is an Automation Bridge which allows you to call Microsoft COM based applications seamlessly from OOoBasic. Hence the approach that I decided to follow was:
Structure Schematic for PPT to ODP conversion
Structure Schematic for PPT to ODP conversion
PPTtoODPdiagram.png (6.94 KiB) Viewed 8193 times
with the controlling Macro written in OOoBasic. I also decided to place this macro is a document rather than a libraryfor the following reasons: (i) This avoids the need to introduce the user into the complications of OOo library management; (ii) Since the conversion process is a batch walk of a folder hierarchy which could hit 100s of PPT files, I needed to produce a log file of the conversion; the easiest way to do this is to use a spreadsheet for the conversion and embed the macro into this spreadsheet.

Developing the PPT Patch Routine

The way that I did this was the same approach that I've evolved for writing VBscripts, which is first to develop the routine in the native VBA engine, in this case in PowerPoint. This has the advantage that you can use the VBA IDE to walk the COM structures and debug the utility. Once the function is working in VBA, it can then be easily ported to OOoBasic. There were three specific wrinkles with this:
  • VBA has default properties and access methods which are not supported through the UNO Automation Bridge. An example here is that you can say oSlide.Shapes(i) in VBA which defaults to the Item access method. In OOoBasic you have to be explicit: oSlide.Shapes.Item(i)
  • VBA is hard typed so you can define object by type and this is enforced both at compile and runtime. In practice OOo Basic is not hard typed, so you need to remove all type declarations. (Yes, OOoBasic supports basic typing such as As String, but if you benchmark the interpreter, you will see that simple Variants execute faster!
  • The last point isn't so much a VBA issue but one of aligning the data models for access to hyperlinks. There are multiple access paths to the hyperlinks in document. I had to ensure that the paths for the PowerPoint and Impress object models were aligned:
    • PPT: Appplication->Presentations->Presentation->Slides->Slide->Shapes->Shape->Action->Hyperlink->Address
      Impress Desktop->Documents->Document->DrawPages->Slide->Shapes->Shape->Bookmark
Which led to this code for the FixLinksInPPT routine:

Code: Select all

'
'--------------------------------- FixLinksInPPT ---------------------------------
' This routine uses COM to talk to PowerPoint to load a PPT and convert any file
' based hyperlinks to dummy http based ones.  The patched PPT is saved in the same
' directory as a new presentation with the prefix TMP$ on the file name.
'
Function FixLinksInPPT(oApp As PowerPoint.Application, sPath As String, sFileName As String) As Long

Const cPresentation = 1
Const cActionHyperlink = 7
Dim cnt, i, j, k, sFile, oPres, oSlides, oSlide, oShapes, oActions, oLink, sAddr

cnt = 0
sFile = ConvertFromURL(sPath + sFileName + sExtn)
oPres = oApp.Presentations.Open(sFile, True, False)
oSlides = oPres.Slides
For i = 1 To oSlides.Count
  oSlide = oSlides.Item(i)
  '
  ' Traversing all the shaps on a slide to find out where any have an action
  ' Hyperlink is a pretty expensive operation, so I have added check so that 
  ' I only do this if there are hyperlinks on the slide.
  '
  If oSlide.Hyperlinks.Count > 0 Then
    oShapes = oSlide.Shapes
    For j = 1 To oShapes.Count
      oActions = oShapes.Item(j).ActionSettings
      For k = 1 To oActions.Count
        If oActions.Item(k).Action = cActionHyperlink Then
          oLink = oActions.Item(k).Hyperlink
          sAddr = ConvertToURL(oLink.Address)
          If Left(sAddr, 8) = "file:///" Then
            If bDEBUG Then logRec Array("FixLinksInPPT", "old link = " + oLink.Address)
            If bDEBUG Then logRec Array("FixLinksInPPT", "old URL = " + sAddr)
            oLink.Address = sDummyPrefix & mid(sAddr,9)
            cnt = cnt + 1
            If bDEBUG Then logRec Array("FixLinksInPPT", "new URL = " + (sDummyPrefix & mid(sAddr,8)))
          End If
        End If
      Next k
    End If
  Next j
Next i

oPres.SaveAs sPath + "TMP$" + sFileName + sExtn, cPresentation
oPres.Close
FixLinksInPPT = cnt
End Function
The ODP Patch Routine

This was similar in structure, though this took somewhat longer to develop because (i) I was not familiar with the Impress Object model and (ii) the OOoBasic IDE does not have the same compile-time object model walk and pick routines as the VBA IDE. The nearest to this is incremental use of the Xray utility. But here is this code:

Code: Select all

'
'--------------------------------- FixLinksInImpress ---------------------------------
' This routine uses Impress to load a PPT and convert any dummy http 
' based hyperlinks to file based ones.  The repatched Presentation is saved in the same
' directory as a new ODP presentation without the prefix TMP$ on the file name.
'
Function FixLinksInImpress(sPath As String, sFile As String)

Dim cnt, sURL, oDoc, oSlide, nSlides, sBookmark, i, j
Dim oPV(0) As New com.sun.star.beans.PropertyValue 
oPV(0).Name = "Hidden" : oPV(0).Value = True
cnt = 0

' GlobalScope.BasicLibraries.LoadLibrary( "XrayTool" )

sUrl = sPath + "TMP$" + sFile + sExtn
oDoc = StarDesktop.loadComponentFromURL(sUrl,"_default",0,Array())
nSlides = oDoc.DrawPages.Count

For i = 0 to nSlides - 1
  oSlide = oDoc.DrawPages(i)
  For j =  0 To oSlide.Count - 1
    sBookmark = oSlide(j).Bookmark
    If LCase(Left(sBookmark, Len(sDummyPrefix))) = sDummyPrefix Then
      If bDEBUG Then logRec Array("FixLinksInImpress", "old BM = " + sBookmark)
      If bDEBUG Then logRec Array("FixLinksInImpress", "new raw BM = " + mid(sBookmark, Len(sDummyPrefix)+1))
      oSlide(j).Bookmark = ConvertFromURL(mid(sBookmark, Len(sDummyPrefix)+1))
      If bDEBUG Then logRec Array("FixLinksInImpress", "new BM = " + oSlide(j).Bookmark)
      cnt = cnt + 1
    End If
  Next j
Next i

oDoc.storeAsURL( ConvertToURL(sPath + sFile + ".odp"), Array ())    
oDoc.Close(True)
oDoc.Dispose
FixLinksInImpress = cnt
End Function
The Glue to Put it together

All that now remained is the driving framework which walks the folder hierarchy and calls these two routines for each PPT. Here is this code:

Code: Select all

Option Explicit
'
' Utility to convert PPT to ODP with workaround to fix an issue with the PPT
' importer which fails to convert file based links correctly.
'
' ** Notes **
' *  all files and directories used in traversing the folder structures are in
'    URL format.  JiT conversion is done where Native windows format is required
'
'
Dim nRow, oApp
Const sExtn = ".ppt"
Const sDummyPrefix = "http://dummytodenotefile/"
Const bDEBUG = True
'
'----------------------------- ConvertPPTtoODP -----------------------------
' This is the main processing routine which parses the folder tree from the
' chosen root.  It invokes PowerPoint and Impress to convert each PPT file
' 
Sub ConvertPPTtoODP

Dim oDialog, oFactory

GlobalScope.BasicLibraries.LoadLibrary("XrayTool")
ClearLogSheet
'
' Use the FolderPicker service to choose the Root Folder
'
oDialog = CreateUnoService("com.sun.star.ui.dialogs.FolderPicker")
oDialog.Description = "Choose root Directory of PPT tree to be converted"
If oDialog.execute() <> com.sun.star.ui.dialogs.ExecutableDialogResults.OK Then Exit Sub
'
' Establish the COM link to PowerPoint
'
oFactory = createUnoService("com.sun.star.bridge.oleautomation.Factory") 
oApp = oFactory.createInstance("PowerPoint.Application")
'
' Now traverse the chosen Folder Hierachy
'
ProcessDirTree(oDialog.Directory + "/")

End Sub
'
'------------------------------ ProcessDirectoryElement ------------------------------
' This is the worker routine called by ProcessDirTree each time it finds a file of the
' specified extension as it recurses into the specfied directory tree.  This is a 
' customised action routine to do the per file processing.  
'
' In this caae the routine calls two main subroutines one to use COM to PowerPoint to 
' fix up the links, and save a temporary PPT, and one to load this PPT into Impress and
' patch them back to the required format.  I have embedded this in a spreadsheet which
' acts as a log of the converstion. 
'  
Sub ProcessDirectoryElement(sDir, filename)

Dim nPPTcnt, nODPcnt, sFile

sFile = Left(filename, Len(filename) - Len(sExtn))
nPPTcnt = FixLinksInPPT(oApp, sDir, sFile)
nODPcnt = FixLinksInImpress(sDir, sFile)
Kill(sDir + "TMP$" + sFile + sExtn)
LogRec Array(sDir, filename,  nPPTcnt, nODPcnt)

End Sub
'
'--------------------------------- ProcessDirTree ---------------------------------
' Recurse into a directory tree applying specified function to each file of given type
' This call the routine ProcessDirectoryElement for each file match
'  
Sub ProcessDirTree(sDir)

Dim subDir(1 To 1), i, j, sDirPattern, sFileName
Const cFiles = 0, cDirs = 16

'sDirPattern = ConvertFromURL()
sFileName = Dir(sDir + "*" + sExtn, cFiles)

While sFileName <> ""
  If (GetAttr(sDir + sFileName) And cDirs) = 0 And Len(sFileName) > 3 Then
    If LCase(Right(sFileName,Len(sExtn))) = sExtn And UCase(Left(sFileName,4)) <> "TMP$" Then
      ProcessDirectoryElement(sDir, sFileName)
    End If
  End If
  sFileName = Dir()
Wend

sFileName = Dir (ConvertFromURL(sDir + "*"), cDirs)
i = 0
While sFileName <> ""
  If sFileName <> "." And sFileName <> ".." Then 
    If (GetAttr( sDir + sFileName) And cDirs) > 0 Then
      i = i + 1
      Redim Preserve subDir(1 To i)
      subDir(i) = sDir + sFileName + "/"
End If
  End If
  sFileName = Dir()
Wend

For j = 1 to i : ProcessDirTree (subDir(j), sExtn) : Next j
	
End Sub
'
'--------------------------------- ClearLogSheet ---------------------------------
' The first sheet in this document is used to log the progress of this conversion.
' This Routine sets up sheet by clearing the previous contents
'  
Sub ClearLogSheet()

Dim oSheet, oCursor

oSheet = ThisComponent.Sheets(0)
ThisComponent.CurrentController.ActiveSheet = oSheet
oCursor = oSheet.createCursor()
oCursor.gotoStartOfUsedArea(False)
oCursor.gotoEndOfUsedArea(True)
oCursor.collapseToCurrentRegion()
oCursor.clearContents(com.sun.star.sheet.CellFlags.VALUE + _
                      com.sun.star.sheet.CellFlags.STRING)
nRow = 0
End Sub
'
'--------------------------------- LogRec ---------------------------------
' This routine takes one argument a variant array of values.  It writes them
' to the next row ins the Logging sheet
'  
Sub LogRec(vaCells)
Dim nCells
nCells = UBound(vaCells)
ThisComponent.Sheets(0).getCellRangeByPosition(0,nRow,nCells,nRow).DataArray = _
  Array(vaCells)
nRow = nRow + 1
End Sub
Lastly this code is embedded in a single module within a spreadsheet. Note that I don't really need a dialogue interface since I used the standard Folder Picker to select the desired folder. I also added a GO button to invoke the driving subroutine, so all you need to do is to open the spreadsheet, click GO and pick the folder that you want to process. The routine then creates an ODP document in every subirectory corresponding to each PPT it finds. (As an aside here, but the button Action dialogue box wouldn't allow me to pick a document macro. I could only pick my library ones. so I gave up and edited the context.xml within the ODS container to select the correct macro hyperlink — yuk!!!

So it works (and I've attached the example ODS if you want to play with it). The error handling is non-existent at the moment, so the program will barf if you leave any PPTs or ODFs open that it needs. Nonetheless this also provides a functional workaround for the PPT hyperlink bug, and converts an extremely long manual activity into a fully automated one. My main purpose h0ere was to show the approach. To make this a robust solution, more work is needed: on error handling and possibly automatic link type conversion (e.g. *.ppt -> *.odp). My closing message is: Scripting + UNO and COM APIs offer you considerable opportunities to automate migration processes.
Attachments
PPTconvertorV2.ods
Complete Conversion utility
(10.49 KiB) Downloaded 620 times
Ubuntu 11.04-x64 + LibreOffice 3 and MS free except the boss's Notebook which runs XP + OOo 3.3.
TerryE
Volunteer
Posts: 1401
Joined: Sat Oct 06, 2007 10:13 pm
Location: UK

Re: Using COM to integrate Powerpoint and Impress

Post by TerryE »

Placeholder for Change Control and to remove topic from "View unanswered posts" list.
Ubuntu 11.04-x64 + LibreOffice 3 and MS free except the boss's Notebook which runs XP + OOo 3.3.
Post Reply