[Solved] Processing XML data with OpenOffice.org Basic

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
Lazy-legs
Posts: 71
Joined: Mon Oct 08, 2007 1:33 am
Location: Århus-Berlin

[Solved] Processing XML data with OpenOffice.org Basic

Post by Lazy-legs »

Hello,

I have a rather simple problem, but I have a feeling that it requires a not so simple solution. I'd like to grab XML data from an RSS feed (e.g., http://identi.ca/dmpop/rss) and insert it into a Writer document. I know that OpenOffice.org Basic can fetch and process XML data, but I have absolutely no idea how this can be done.

Thank you!

Kind regards,
Dmitri
Last edited by Lazy-legs on Fri Apr 10, 2009 4:08 pm, edited 1 time in total.
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Processing XML data with OpenOffice.org Basic

Post by Villeroy »

Most programming languages can fetch XML data from the web and parse it. StarBasic can't, since it is just a convenient way of calling the office API. It has almost no interfaces to the "outside world". Unlike other modern programming languages, it has no XML parser of its own.
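To illustrate the "parse it" half in a general-purpose language, here is a minimal Python sketch using only the standard library. The feed text is a made-up sample in RSS 2.0 shape (an assumption; the real identi.ca feed may be structured differently), and `rss_items` is just an illustrative name.

```python
import xml.etree.ElementTree as ET

# Made-up sample in RSS 2.0 shape (assumption: the real feed may differ).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>dmpop</title>
    <link>http://identi.ca/dmpop</link>
    <item>
      <title>dmpop: first notice</title>
      <link>http://identi.ca/notice/1</link>
    </item>
    <item>
      <title>dmpop: second notice</title>
      <link>http://identi.ca/notice/2</link>
    </item>
  </channel>
</rss>"""


def rss_items(xml_text):
    """Return a (title, link) pair for every <item> element in the document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in rss_items(SAMPLE_RSS):
    print(title)
    print(link)
```

Fetching the text over HTTP would be a separate step (for example with `urllib.request.urlopen`); it is left out here so the sketch runs without network access.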
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Lazy-legs

Re: Processing XML data with OpenOffice.org Basic

Post by Lazy-legs »

Hi Villeroy,

Thanks for your reply. Not to question your knowledge on the subject, but it looks like the Lorem ipsum generator macro does exactly that: it fetches and processes XML data. Or am I completely wrong on this one?

Kind regards,
Dmitri
hol.sten
Volunteer
Posts: 495
Joined: Mon Oct 08, 2007 1:31 am
Location: Hamburg, Germany

Re: Processing XML data with OpenOffice.org Basic

Post by hol.sten »

Lazy-legs wrote:it looks like the Lorem ipsum generator macro does exactly that: it fetches and processes XML data. Or am I completely wrong on this one?
No, you're not. But why are you asking at all, given that you know this fine example? To get XML processing working in OOo Basic, all you have to do is take a closer look at the contents of the OOo extension LoremIpsum150.oxt. If you have ever written a SAX parser in another programming language, you'll find a quite similar example in OOo Basic inside the file Generate/LoremIpsum.xba in the extension (simply unzip it):

Code:

...
Sub ReadXmlFromInputStream( oInputStream )
  oSaxParser = createUnoService( "com.sun.star.xml.sax.Parser" )
  oDocEventsHandler = CreateDocumentHandler()
  oSaxParser.setDocumentHandler( oDocEventsHandler )
  oInputSource = createUnoStruct( "com.sun.star.xml.sax.InputSource" )
  With oInputSource
    .aInputStream = oInputStream   ' plug in the input stream
  End With

  oSaxParser.parseStream( oInputSource )
End Sub

'==================================================
'   Xml Sax document handler.
'==================================================
Private goLocator As Object
Private glLocatorSet As Boolean

Function CreateDocumentHandler()
  oDocHandler = CreateUnoListener( "DocHandler_", "com.sun.star.xml.sax.XDocumentHandler" )
  glLocatorSet = False
  CreateDocumentHandler() = oDocHandler
End Function

'==================================================
'   Methods of our document handler call these
'    global functions.
'   These methods look strangely similar to
'    a SAX event handler.  ;-)
'   These global routines are called by the Sax parser
'    as it reads in an XML document.
'   These subroutines must be named with a prefix that is
'    followed by the event name of the com.sun.star.xml.sax.XDocumentHandler interface.
'==================================================

Sub DocHandler_characters( cChars As String )
  if xNode = "lipsum" then 
    oWrite=1
    cChars= Left(cChars,len(cChars)-1)
    if len(cChars)>1 then
      cChars= cChars+ Chr$(13)
    else
      cChars=cChars
    endif
    WriteLoremipsum (cChars, oWrite)
  Else
    oWrite=0
  Endif	
End Sub

Sub DocHandler_ignorableWhitespace( cWhitespace As String )
End Sub

Sub DocHandler_processingInstruction( cTarget As String, cData As String )
End Sub

Sub DocHandler_startDocument()
  'Print "Start document"
End Sub

Sub DocHandler_endDocument()
  'Print "End document"
End Sub

Sub DocHandler_startElement( cName As String, oAttributes As com.sun.star.xml.sax.XAttributeList )
  'Print "Start element............", cName
  xNode = cName
End Sub

Sub DocHandler_endElement( cName As String )
  'Print "End element", cName
End Sub

Sub DocHandler_setDocumentLocator( oLocator As com.sun.star.xml.sax.XLocator )
  ' Save the locator object in a global variable.
  ' The locator object has valuable methods that we can
  ' call to determine
  goLocator = oLocator
  glLocatorSet = True
End Sub
...
I got this source code from the OOo extension mentioned above. All I did was delete some whitespace from the original.

What part of your first question is still unanswered after a closer look into the OOo Basic source code of LoremIpsum150.oxt?
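For readers who know SAX from other languages, the callback names in the Basic excerpt map almost one to one onto Python's `xml.sax.ContentHandler`. The following is only a rough sketch of that correspondence, not a port of the extension: the input is made up, the `<feed><lipsum>` shape is assumed from the element name tested in the macro, and unlike the Basic code this version clears the current element name in `endElement`.

```python
import xml.sax


class LipsumHandler(xml.sax.ContentHandler):
    """Collects the text found inside <lipsum> elements."""

    def __init__(self):
        super().__init__()
        self.current = ""   # plays the role of xNode in the Basic excerpt
        self.chunks = []    # text pieces delivered by the parser

    def startElement(self, name, attrs):   # cf. DocHandler_startElement
        self.current = name

    def endElement(self, name):            # cf. DocHandler_endElement
        self.current = ""

    def characters(self, content):         # cf. DocHandler_characters
        if self.current == "lipsum":
            self.chunks.append(content)


# Made-up input; element names other than "lipsum" are hypothetical.
SAMPLE = b"<feed><lipsum>Lorem ipsum dolor sit amet.</lipsum><other>ignored</other></feed>"

handler = LipsumHandler()
xml.sax.parseString(SAMPLE, handler)
print("".join(handler.chunks))
```

The chunks are joined at the end because a SAX parser is allowed to deliver the text of one element in several `characters` calls.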
OOo 3.2.0 on Ubuntu 10.04 • OOo 3.2.1 on Windows 7 64-bit and MS Windows XP
Lazy-legs

Re: Processing XML data with OpenOffice.org Basic

Post by Lazy-legs »

Hi hol.sten,

Thank you for your reply.
hol.sten wrote:No, you're not. But why are you asking at all with the knowledge of this fine example?
I'm asking because I'm not a programmer, and while I can see that it should be possible to process XML data, I can't figure out how. I did take a look at the Lorem ipsum generator macro, but I found it too complicated for my limited skills.
hol.sten wrote:What part of your first question is still unanswered after a closer look into the OOo Basic source code of LoremIpsum150.oxt?
Just one: how can I actually use the code snippet you posted to fetch the http://identi.ca/dmpop/rss RSS feed and insert it into a new Writer document?

Thank you!

Kind regards,
Dmitri
Villeroy

Re: Processing XML data with OpenOffice.org Basic

Post by Villeroy »

Lazy-legs wrote:Hi Villeroy,

Thanks for your reply. Not to question your knowledge on the subject, but it looks like the Lorem ipsum generator macro does exactly that: it fetches and processes XML data. Or am I completely wrong on this one?

Kind regards,
Dmitri
Basic neither fetches nor parses anything. It delegates everything to the office suite by calling its services "com.sun.star.ucb.SimpleFileAccess" and "com.sun.star.xml.sax.Parser".
On Linux, the same thing can be implemented in many different programming languages without needing a 300 MB office suite. It would not even take a desktop system to parse some input parameters (amount, what and start for lipsum.com), fetch the data from the other machine over HTTP and finally parse the incoming XML one way or the other.
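A rough Python sketch of the first two of those steps, under stated assumptions: the host and path are the ones quoted for the lipsum.com feed later in this thread, the parameter names are the three mentioned above, and the value "paras" for `what` is a guess used purely for illustration. `build_feed_url` and `fetch` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Host and path as quoted later in this thread; treat them as an assumption.
BASE_URL = "http://www.lipsum.com/feed/xml"


def build_feed_url(amount, what, start):
    """Turn the three input parameters into a request URL."""
    query = urlencode({"amount": amount, "what": what, "start": start})
    return BASE_URL + "?" + query


def fetch(url, timeout=10):
    """Return the raw response body (not called here: needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# "paras" is an assumed value for the 'what' parameter.
print(build_feed_url(2, "paras", "Yes"))
```

The bytes returned by `fetch` could then be handed to any XML parser, which is the third step.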
The question is: What is the reason that you want to have the result in a word processor? It's obvious with respect to the Lorem Ipsum, but for me it's not that obvious what you want to do with the URLs generated by http://bit.ly/ (I assume this is still the same topic as in the other thread) in a word processor or spreadsheet.
Well, I can imagine several things you could do with the received data, but why write a program that can work only within the extremely narrow context of this specific word processor or spreadsheet?
Lazy-legs wrote:I'm asking because I'm not a programmer, and while I can see that it should be possible to process XML data, I can't figure out how. I did take a look at the Lorem ipsum generator macro, but I found it too complicated for my limited skills.
Your "lazy legs" in combination with a "diligent mouth" seem to be a problem. You invest a huge amount of time and energy on oooforum.org and right here to nag people until they drop the snippet you want.
Lazy-legs

Re: Processing XML data with OpenOffice.org Basic

Post by Lazy-legs »

Villeroy wrote:The question is: What is the reason that you want to have the result in a word processor? It's obvious with respect to the Lorem Ipsum, but for me it's not that obvious what you want to do with the URLs generated by http://bit.ly/ (I assume this is still the same topic as in the other thread) in a word processor or spreadsheet.
The reason is simple. Dealing with long URLs in Writer documents is notoriously tricky, so I'd like to be able to shorten them using a macro.
Villeroy wrote:Well, I can imagine several things you could do with the received data, but why write a program that can work only within the extremely narrow context of this specific word processor or spreadsheet?
Who said that you have to write anything at all?
Villeroy wrote:Your "lazy legs" in combination with a "diligent mouth" seem to be a problem. You invest a huge amount of time and energy on oooforum.org and right here to nag people until they drop the snippet you want.
I'm sorry you feel this way, especially considering how much you have helped me and other forum users. The last thing I want is to nag people. I ask questions in the hope of getting a usable answer. Isn't that what the forums are for? Anyway, I apologize if I said or did something wrong.

Kind regards,
Dmitri
hol.sten

Re: Processing XML data with OpenOffice.org Basic

Post by hol.sten »

Lazy-legs wrote:Just one: how can I actually use the code snippet you posted to fetch the http://identi.ca/dmpop/rss RSS feed and insert it into a new Writer document?
Give this a try:

Code:

REM  *****  BASIC  *****

Dim thisDoc As Object
Dim cXmlUrl as String
Dim sNode as String

Sub Main
  thisDoc=ThisComponent
  cXmlUrl = ConvertToURL( "http://identi.ca/dmpop/rss" )
  ReadXmlFromUrl( cXmlUrl )
End Sub

Sub ReadXmlFromUrl( cUrl )
  oSimpleFileAccess = createUnoService( "com.sun.star.ucb.SimpleFileAccess" )
  oInputStream = oSimpleFileAccess.openFileRead( cUrl )
  ReadXmlFromInputStream( oInputStream )
  oInputStream.closeInput()
End Sub

Sub ReadXmlFromInputStream( oInputStream )
  oSaxParser = createUnoService( "com.sun.star.xml.sax.Parser" )
  oDocEventsHandler = CreateDocumentHandler()
  oSaxParser.setDocumentHandler( oDocEventsHandler )
  oInputSource = createUnoStruct( "com.sun.star.xml.sax.InputSource" )
  ' plug in the input stream
  With oInputSource
    .aInputStream = oInputStream
  End With
  oSaxParser.parseStream( oInputSource )
End Sub

'==================================================
'   Xml Sax document handler.
'==================================================
Function CreateDocumentHandler()
   oDocHandler = CreateUnoListener( "DocHandler_", "com.sun.star.xml.sax.XDocumentHandler" )
   CreateDocumentHandler() = oDocHandler
End Function

'==================================================
'   Methods of our document handler call these
'    global functions.
'   These methods look strangely similar to
'    a SAX event handler.  ;-)
'   These global routines are called by the Sax parser
'    as it reads in an XML document.
'   These subroutines must be named with a prefix that is
'    followed by the event name of the com.sun.star.xml.sax.XDocumentHandler interface.
'==================================================
Sub DocHandler_startDocument()
End Sub

Sub DocHandler_endDocument()
End Sub

Sub DocHandler_startElement( cName As String, oAttributes As com.sun.star.xml.sax.XAttributeList )
  sNode = cName
End Sub

Sub DocHandler_endElement( cName As String )
End Sub

Sub DocHandler_characters( cChars As String )
  If sNode = "title" Or sNode = "link" Then
    If len(cChars) > 0 Then
      WriteToDocument ( cChars )
    Endif
  Elseif sNode = "description" Then
    WriteToDocument ( "" + Chr$(13) )
  Endif
End Sub

Sub DocHandler_setDocumentLocator( oLocator As com.sun.star.xml.sax.XLocator )
End Sub 

Sub DocHandler_ignorableWhitespace( cWhitespace As String )
End Sub

Sub DocHandler_processingInstruction( cTarget As String, cData As String )
End Sub
'==================================================

Sub WriteToDocument( sText As String )
  Dim oDoc As Object
  Dim oText As Object
  Dim oVCurs As Object
  Dim oTCurs As Object

  oDoc = ThisComponent
  oText = oDoc.Text
  oVCurs = oDoc.CurrentController.getViewCursor()
  oTCurs = oText.createTextCursorByRange(oVCurs.getStart())
  oText.insertString(oTCurs, sText, FALSE)
End Sub
At least some code to start with.

Here on Ubuntu with OOo 2.4.1 the OOo Basic macro created this document:
dmpop
http://identi.ca/dmpop



dmpop
http://identi.ca/dmpop
dmpop: HADOPI, French "three strikes" law rejected. http://bit.ly/PtQ
http://identi.ca/notice/3347585



dmpop: I didn't realize how much I depend on Chandler until Chander Hub went down. #chandler # musthavetool
http://identi.ca/notice/3331289



dmpop: Font of the day: MISO http://omkrets.se/typografi/
http://identi.ca/notice/3330795



dmpop: OpenOffice.org Native MySQL Driver Alpha Build available http://bit.ly/l52Vm #openoffice.org #mysql
http://identi.ca/notice/3330546
...
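The listing above shows the feed name and link twice, plus runs of empty paragraphs. A plausible cause (an assumption, not verified against the real feed) is that the handler reacts to every title and link element in the document and that `characters` also fires for whitespace between elements. The Python sketch below shows one common way around this with SAX: only look inside `<item>`, buffer the text, and emit it when the element closes. The sample input is made up in RSS 2.0 shape and reuses one notice from the output above.

```python
import xml.sax


class ItemHandler(xml.sax.ContentHandler):
    """Collects title and link text, but only for elements inside <item>."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.buffer = None   # list of text chunks while inside <title>/<link>
        self.lines = []

    def startElement(self, name, attrs):
        if name == "item":
            self.in_item = True
        elif self.in_item and name in ("title", "link"):
            self.buffer = []

    def characters(self, content):
        # Text outside a buffered element (e.g. whitespace) is ignored.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.in_item = False
            self.lines.append("")   # one blank line between notices
        elif self.buffer is not None and name in ("title", "link"):
            self.lines.append("".join(self.buffer).strip())
            self.buffer = None


# Made-up sample (assumption: the real feed may be structured differently).
SAMPLE = b"""<rss><channel>
<title>dmpop</title><link>http://identi.ca/dmpop</link>
<item>
  <title>dmpop: Font of the day: MISO http://omkrets.se/typografi/</title>
  <link>http://identi.ca/notice/3330795</link>
</item>
</channel></rss>"""

handler = ItemHandler()
xml.sax.parseString(SAMPLE, handler)
print("\n".join(handler.lines))
```

The same idea should carry over to the Basic macro by appending to a string in DocHandler_characters and writing it out from DocHandler_endElement, though that variant is untested here.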
Lazy-legs

Re: Processing XML data with OpenOffice.org Basic

Post by Lazy-legs »

This is perfect! Thank you very much, hol.sten! I think I can figure out the rest myself. :-)

Kind regards,
Dmitri
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: [Solved] Processing XML data with OpenOffice.org Basic

Post by acknak »

Alternatively, you could load your RSS feed directly into Writer using an XSLT filter, with no macros needed.
AOO4/LO5 • Linux • Fedora 23
zhivko
Posts: 7
Joined: Mon Feb 22, 2010 11:29 am

Re: [Solved] Processing XML data with OpenOffice.org Basic

Post by zhivko »

This is much easier now in 4.2:
svc = createUnoService( "com.sun.star.sheet.FunctionAccess" ) 'Create a service to use Calc functions
XML_String = svc.callFunction("WEBSERVICE",array("http://www.lipsum.com/feed/xml?amount=2 ... &start=Yes"))
Lipsum = svc.callFunction("FILTERXML", array(XML_String, "/feed/lipsum" ))
Print Lipsum

Another advantage of this is that you are not limited to MSXML if you are on a Unix platform :)
OpenOffice 3.1 on Windows XP