Hello everyone,
I want to extract comments from a .odt file and create a csv/ods file out of the following extracted fields:
- Section number (if present)
- Text on which comment was given
- Reviewer's name
- Comment
I know one way is to 1) rename the .odt to .zip, 2) unzip it, and 3) parse content.xml, but I am not so good at XML.
Can this be done using a macro/StarBasic script in OpenOffice?
Please help.
Thanks
SB Script/Macro to extract comments from odt to csv/ods
Last edited by kritadhi on Tue May 10, 2011 6:44 am, edited 1 time in total.
Regards,
Kritadhi
OpenOffice.org 3.3 Windows XP
OpenOffice.org 3.2.1 (Build:9505) Ubuntu
- MrProgrammer
Re: SB Script/Macro to extract comments from odt to csv/ods
Hi, and welcome to the forum.
I'm sure you can do this, but unless you want to extract the data from content.xml it will require a macro, and writing macros takes quite a bit of study. People typically suggest that a couple of months are needed before one can write non-trivial macros. If you want to go that route, see http://www.pitonyak.org/AndrewMacro.odt, section 7.29.1 (Enumerate text fields), where he shows a macro that seems close to what you need. I believe the service of interest is com.sun.star.text.TextField.Annotation.
It is not obvious to me how one would determine which section the comment is in. No doubt this doesn't require much coding and will look simple once one understands how documents are organized internally, but text documents are not simple, and acquiring that understanding is what takes all the study. You could also search the Customizing and Extending section of this forum for relevant topics.
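For comparison, the content.xml route can be sketched outside OpenOffice in a few lines of Python. This is only a rough sketch, not a tested tool: it assumes each comment is stored as an office:annotation element with a dc:creator child and one or more text:p children, which is how the OpenDocument format describes annotations. The function names extract_comments, read_content_xml and to_csv are made up for illustration.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_content_xml(odt_file):
    """An .odt file is already a zip archive; return its content.xml as bytes."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.read("content.xml")


def extract_comments(content_xml):
    """Return a list of (reviewer, comment) tuples in document order."""
    root = ET.fromstring(content_xml)
    rows = []
    for note in root.iter("{%s}annotation" % NS["office"]):
        reviewer = note.findtext("dc:creator", default="", namespaces=NS)
        # A comment body can span several paragraphs; join them with newlines.
        paragraphs = ["".join(p.itertext()) for p in note.findall("text:p", NS)]
        rows.append((reviewer, "\n".join(paragraphs)))
    return rows


def to_csv(rows):
    """Render the rows as CSV text; the csv module quotes commas and newlines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Reviewer", "Comment"])
    writer.writerows(rows)
    return buffer.getvalue()
```

With a hypothetical file name, usage would look like to_csv(extract_comments(read_content_xml("review.odt"))). No renaming to .zip is needed, since zipfile opens the .odt directly.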
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.7.5, iMac Intel. The locale for any menus or Calc formulas in my posts is English (USA).
Re: SB Script/Macro to extract comments from odt to csv/ods
Thanks a lot for the direction.
I wrote this much, just to see whether I can read the annotations from the document.
It reports "Found 535 unexpected field types". I have some comments embedded in the document that contains the macro. Any idea why the macro is not detecting the annotations?
Code:
Sub ExtractAnnotations(Optional oUseDoc)
    Dim oDoc        'Document to use.
    Dim s$          'Generic string variable.
    Dim nUnknown%   'Count the unexpected field types.
    Dim oEnum       'Enumeration of the text fields.
    Dim oField      'Enumerated text field.

    If IsMissing(oUseDoc) Then
        oDoc = ThisComponent
    Else
        oDoc = oUseDoc
    End If

    oEnum = oDoc.getTextFields().createEnumeration()
    If IsNull(oEnum) Then
        Print "getTextFields().createEnumeration() returns NULL"
        Exit Sub
    End If

    Do While oEnum.hasMoreElements()
        oField = oEnum.nextElement()
        If oField.supportsService("com.sun.star.text.TextField.Annotation") Then
            s = s & oField.Author & oField.Content
            MsgBox s, 0, "Text Fields"
            s = ""
        Else
            nUnknown = nUnknown + 1
        End If
    Loop
    If nUnknown > 0 Then Print "Found " & nUnknown & " unexpected field types"
End Sub
Regards,
Kritadhi
Re: SB Script/Macro to extract comments from odt to csv/ods
My mistake. The code I wrote works, in that it displays all the annotations one by one.
The next target is to create a CSV out of it.
How to read the following is still a mystery:
- Section number (if present)
- Text on which comment was given
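One possible way to get at these two fields, sketched in Python against content.xml rather than as a macro, is shown below. It rests on several assumptions: it reports the text of the nearest preceding heading (text:h) as a stand-in for the section number, since the number itself is typically computed by the application when the document is displayed; it reports the whole paragraph containing the comment anchor as a stand-in for the commented text; and it ignores special elements such as tabs and repeated spaces. The names visible_text and comments_with_context are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DC = "http://purl.org/dc/elements/1.1/"

ANNOTATION = "{%s}annotation" % OFFICE
HEADING = "{%s}h" % TEXT
PARAGRAPH = "{%s}p" % TEXT
CREATOR = "{%s}creator" % DC


def visible_text(elem):
    """Text of elem and its children, skipping the bodies of comments."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != ANNOTATION:
            parts.append(visible_text(child))
        parts.append(child.tail or "")  # text that follows the child element
    return "".join(parts)


def comments_with_context(content_xml):
    """Return (heading, paragraph, reviewer, comment) tuples in document order."""
    rows = []
    heading = ""  # text of the most recent heading seen so far

    def walk(elem):
        nonlocal heading
        if elem.tag in (HEADING, PARAGRAPH):
            if elem.tag == HEADING:
                heading = visible_text(elem)
            for note in elem.iter(ANNOTATION):
                reviewer = note.findtext(CREATOR, default="")
                comment = "\n".join(
                    "".join(p.itertext()) for p in note.findall(PARAGRAPH)
                )
                rows.append((heading, visible_text(elem), reviewer, comment))
            return  # this sketch treats paragraphs and headings as leaves
        for child in elem:
            walk(child)

    walk(ET.fromstring(content_xml))
    return rows
```

Comments inside a frame that is anchored in a paragraph would be attributed to the outer paragraph here, so the result needs checking against a real document.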
Regards,
Kritadhi
Re: SB Script/Macro to extract comments from odt to csv/ods
This is how far I have got:
Code:
REM Author: Kritadhi Chakraborty
Sub ExtractAnnotations(Optional oUseDoc)
    Dim oDoc            'Document to use.
    Dim n%              'Count the number of Annotation text fields.
    Dim oEnum           'Enumeration of the text fields.
    Dim oField          'Enumerated text field.
    Dim FileName As String
    Dim sDocURL As String
    Dim sDocName As String
    Dim fileNo As Integer
    Dim oViewCursor     'Current view cursor
    Dim oAnchor

    If IsMissing(oUseDoc) Then
        oDoc = ThisComponent
    Else
        oDoc = oUseDoc
    End If

    'FileNameoutofPath comes from the Tools library
    If (Not GlobalScope.BasicLibraries.isLibraryLoaded("Tools")) Then
        GlobalScope.BasicLibraries.LoadLibrary("Tools")
    End If

    'create file name; an unsaved document has no URL to derive it from
    If (Not oDoc.hasLocation()) Then
        MsgBox "Please save the document first.", 0, "Review Comment Sheet"
        Exit Sub
    End If
    sDocURL = oDoc.getURL()
    sDocName = FileNameoutofPath(sDocURL, "/")
    FileName = ConvertToURL(CurDir) & "/" & sDocName & ".csv"

    'check the enumeration before the output file is opened
    oEnum = oDoc.getTextFields().createEnumeration()
    If IsNull(oEnum) Then
        Print "getTextFields().createEnumeration() returns NULL"
        Exit Sub
    End If

    fileNo = FreeFile()
    Open FileName For Output Access Read Write As #fileNo
    'Write # quotes each string and separates the values with commas,
    'so every column is passed as a separate argument
    Write #fileNo, "No", "Filename", "Page", "Reviewer", "Comment", "Response"

    'main logic
    Do While oEnum.hasMoreElements()
        oField = oEnum.nextElement()
        If oField.supportsService("com.sun.star.text.TextField.Annotation") Then
            oAnchor = oField.getAnchor()
            oDoc.getCurrentController().select(oField)
            'move the view cursor to the comment so that it can report the page
            oViewCursor = oDoc.getCurrentController().getViewCursor()
            oViewCursor.gotoRange(oAnchor, False)
            n = n + 1
            'the Response column is left empty for the author to fill in
            Write #fileNo, n, sDocName, oViewCursor.getPage(), oField.Author, oField.Content, ""
        End If
    Loop
    Close #fileNo
    MsgBox FileName, 0, "Review Comment Sheet"
End Sub
Regards,
Kritadhi