SB Script/Macro to extract comments from odt to csv/ods

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
kritadhi
Posts: 5
Joined: Mon May 09, 2011 11:14 am

Post by kritadhi »

Hello everyone,

I want to extract comments from a .odt file and create a csv/ods file out of the following extracted fields:

Section number (if present)
Text on which comment was given
Reviewer's name
Comment


I know one way is to 1) rename the .odt to .zip, 2) unzip it, and 3) parse content.xml, but I am not very good with XML.

Can this be done using a macro/StarBasic script in OpenOffice?

Please help.

Thanks
Last edited by kritadhi on Tue May 10, 2011 6:44 am, edited 1 time in total.
Regards,
Kritadhi

OpenOffice.org 3.3 Windows XP
OpenOffice.org 3.2.1 (Build:9505) Ubuntu
MrProgrammer
Moderator
Posts: 5258
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: SB Script/Macro to extract comments from odt to csv/ods

Post by MrProgrammer »

Hi, and welcome to the forum.

I'm sure this can be done, but unless you want to extract the data from content.xml it will require a macro, and writing macros takes quite a bit of study. People typically suggest that it takes a couple of months before one can write non-trivial macros. If you want to go that route, see http://www.pitonyak.org/AndrewMacro.odt, section 7.29.1 (Enumerate text fields), which shows a macro that seems close to what you need. I believe the service of interest is com.sun.star.text.TextField.Annotation. It is not obvious to me how one would determine which section a comment is in. That probably does not require much code and will look simple once one understands how documents are organized internally, but text documents are not simple, and acquiring that understanding is what takes all the study. You could also search the Customizing and Extending section of this forum for relevant topics.
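
For comparison, the content.xml route might look roughly like the Python sketch below. It relies only on the standard zipfile and xml.etree modules and on the ODF element names office:annotation, dc:creator and text:p. The function name extract_annotations is made up for illustration, and the sketch has not been tried against documents from every OpenOffice version.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF namespaces used by comments (annotations) in content.xml.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_annotations(odt_path):
    """Return a list of (author, comment) tuples found in an .odt file.

    odt_path may be a file name or a file-like object; an .odt file is
    already a zip archive, so no renaming is needed.
    """
    with zipfile.ZipFile(odt_path) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    rows = []
    for note in root.iter("{%s}annotation" % NS["office"]):
        author = note.findtext("dc:creator", default="", namespaces=NS)
        # A comment body may span several paragraphs; join them with newlines.
        paragraphs = ["".join(p.itertext()) for p in note.findall("text:p", NS)]
        rows.append((author, "\n".join(paragraphs)))
    return rows
```

Calling extract_annotations("review.odt") (a hypothetical file name) would return one tuple per comment, in document order.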
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.7.5, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).

Re: SB Script/Macro to extract comments from odt to csv/ods

Post by kritadhi »

Thanks a lot for the direction.

I have written this much so far, just to see whether I can read the annotations from the document:

Code:

Sub ExtractAnnotations(Optional oUseDoc)
  Dim oDoc      'Document to use.
  Dim s$        'Generic string variable.
  Dim nUnknown% 'Count the unexpected field types.
  Dim oEnum     'Enumeration of the text fields.
  Dim oField    'Enumerated text field.

  If IsMissing(oUseDoc) Then
    oDoc = ThisComponent
  Else
    oDoc = oUseDoc
  End If

  oEnum = oDoc.getTextFields().createEnumeration()
  If IsNull(oEnum) Then
    Print "getTextFields().createEnumeration() returns NULL"
    Exit Sub
  End If

  Do While oEnum.hasMoreElements()
    oField = oEnum.nextElement()
    If oField.supportsService("com.sun.star.text.TextField.Annotation") Then
      s = s & oField.Author & oField.Content
      MsgBox s, 0, "Text Fields"
      s = ""
    Else
      nUnknown = nUnknown + 1
    End If
  Loop

  If nUnknown > 0 Then Print "Found "  & nUnknown & " unexpected field types"

End Sub
It reports "Found 535 unexpected field types". I have some comments embedded in the document that contains the macro. Any idea why the macro is not detecting the annotations?

Re: SB Script/Macro to extract comments from odt to csv/ods

Post by kritadhi »

My mistake. The code that I wrote works: it displays all the annotations one by one :-) The next target is to create a CSV file from them.

How to read the following is still a mystery:

- Section number (if present)
- Text on which comment was given
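
One possible way to get at both items through the content.xml route is sketched below in Python. The idea is to walk the document in order, remember the most recent heading (text:h), and for each comment climb up to the enclosing paragraph. The names visible_text and annotation_context are made up for illustration. As far as I know, automatically generated chapter numbers are usually not stored in the heading text, so this recovers the heading text rather than a section number.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ANNOTATION = "{%s}annotation" % OFFICE
HEADING = "{%s}h" % TEXT
PARAGRAPH = "{%s}p" % TEXT


def visible_text(element):
    """Concatenate an element's text, skipping the bodies of any comments."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != ANNOTATION:
            parts.append(visible_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def annotation_context(content_xml):
    """Return (heading, paragraph_text) for each comment, in document order."""
    root = ET.fromstring(content_xml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    heading = ""
    rows = []
    for element in root.iter():
        if element.tag == HEADING:
            heading = visible_text(element)
        elif element.tag == ANNOTATION:
            # Climb to the nearest enclosing paragraph or heading.
            block = parents.get(element)
            while block is not None and block.tag not in (HEADING, PARAGRAPH):
                block = parents.get(block)
            paragraph = visible_text(block) if block is not None else ""
            rows.append((heading, paragraph))
    return rows
```

The argument is the content of content.xml as a string or bytes. Each returned row pairs the last heading seen before the comment with the text of the paragraph the comment is attached to.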

Re: SB Script/Macro to extract comments from odt to csv/ods

Post by kritadhi »

This is how far I have got:

Code:

REM Author: Kritadhi Chakraborty
Sub ExtractAnnotations(Optional oUseDoc)
  Dim oDoc        'Document to use.
  Dim n%          'Count the number of Annotation text fields.
  Dim oEnum       'Enumeration of the text fields.
  Dim oField      'Enumerated text field.
  Dim FileName As String 'URL of the CSV file to write.
  Dim sDocURL     'URL of the document (empty if it was never saved).
  Dim fileNo As Integer
  Dim oViewCursor 'Current view cursor
  Dim oAnchor     'Text range the annotation is anchored at.

  If IsMissing(oUseDoc) Then
    oDoc = ThisComponent
  Else
    oDoc = oUseDoc
  End If

  'FileNameoutofPath is defined in the Tools library.
  If (Not GlobalScope.BasicLibraries.isLibraryLoaded("Tools")) Then
    GlobalScope.BasicLibraries.LoadLibrary("Tools")
  End If

  If (oDoc.hasLocation()) Then
    sDocURL = oDoc.getURL()
  End If

  'create file name: <current directory>/<document name>.csv
  FileName = ConvertToURL(CurDir) & "/" & FileNameoutofPath(sDocURL, "/") & ".csv"

  'MsgBox FileName, 0, "Filename"

  fileNo = FreeFile()
  Open FileName For Output Access Read Write As #fileNo

  'Write # encloses each string expression in quotes and separates the
  'expressions with commas, so the fields are passed separately.
  Write #fileNo, "No", "Filename", "Page", "Reviewer", "Comment", "Response"

  'main logic
  oEnum = oDoc.getTextFields().createEnumeration()
  If IsNull(oEnum) Then
    Print "getTextFields().createEnumeration() returns NULL"
    Close #fileNo
    Exit Sub
  End If

  Do While oEnum.hasMoreElements()
    oField = oEnum.nextElement()

    If oField.supportsService("com.sun.star.text.TextField.Annotation") Then
      oAnchor = oField.getAnchor()

      'Move the view cursor of the chosen document (not necessarily
      'ThisComponent) to the anchor to find out the page number.
      oDoc.getCurrentController().select(oField)
      oViewCursor = oDoc.getCurrentController().getViewCursor()
      oViewCursor.gotoRange(oAnchor, False)

      n = n + 1
      'The last field is the empty Response column.
      Write #fileNo, n, FileNameoutofPath(sDocURL, "/"), oViewCursor.getPage(), oField.Author, oField.Content, ""
    End If

  Loop

  Close #fileNo

  MsgBox FileName, 0, "Review Comment Sheet"

End Sub
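
Comments often contain commas, double quotes or line breaks, so every field has to be quoted consistently for the CSV to open cleanly in Calc. As a point of comparison, here is a minimal Python sketch of the same output using the standard csv module, which handles the quoting itself. The function name rows_to_csv and the sample values are made up for illustration.

```python
import csv
import io

HEADER = ["No", "Filename", "Page", "Reviewer", "Comment", "Response"]


def rows_to_csv(rows):
    """Render (filename, page, reviewer, comment) tuples as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for number, (filename, page, reviewer, comment) in enumerate(rows, start=1):
        # The Response column is left empty, to be filled in later.
        writer.writerow([number, filename, page, reviewer, comment, ""])
    return buffer.getvalue()
```

For example, rows_to_csv([("spec.odt", 3, "Ann", 'Use "shall", not "will"')]) produces a comment field wrapped in quotes with the embedded quotes doubled, which csv.reader or Calc can read back unchanged.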