[Solved] [Python] Read Writer text paragraph by paragraph

Post by _savage » Wed Aug 07, 2013 1:52 pm

I would like to remotely read the content of a loaded document, paragraph by paragraph. This is how I connect to OpenOffice and load the document:

Code:
>>> import uno
>>> local = uno.getComponentContext()
>>> resolver = local.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", local)
>>> context = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
>>> desktop = context.ServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)
>>> document = desktop.loadComponentFromURL("file:///bla.doc", "_blank", 0, ())
>>> cursor = document.Text.createTextCursor()
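
The connection string and the document URL above are hard-coded. As a minimal sketch, they could be built from parameters with the standard library alone. The helper names `uno_connect_url` and `file_url` are hypothetical, and the sketch assumes the office was started listening on the same host and port (for example with a matching `--accept` option).

```python
from pathlib import Path


def uno_connect_url(host="localhost", port=2002):
    """Build the string passed to resolver.resolve() (hypothetical helper)."""
    return f"uno:socket,host={host},port={port};urp;StarOffice.ComponentContext"


def file_url(path):
    """Turn a local path into a file:// URL for loadComponentFromURL()."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    return Path(path).resolve().as_uri()
```

With these, the calls would read `resolver.resolve(uno_connect_url())` and `desktop.loadComponentFromURL(file_url("bla.doc"), "_blank", 0, ())`.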

So far I've gotten to

Code:
>>> document.Text.getString()

but that gives me the whole text as a single plain string.

What I would like to do is go from paragraph to paragraph, query the kind/type of each paragraph, and then read the text of that one paragraph only. Even better, could I walk through the text of a paragraph and detect formatting such as bold or italic?

Any help is appreciated :) Thanks!
Last edited by Hagar Delest on Sat Aug 10, 2013 11:12 pm, edited 2 times in total.
Reason: tagged [Solved].
Mac 10.11 using LO 5.3.6.1, Gentoo Linux using LO 5.3.4.2 headless.

Re: [Python] How to read text from Writer paragraph by parag

Post by FJCC » Wed Aug 07, 2013 2:59 pm

This Basic code will produce a Portion for every differently formatted run of text in a document.
Code:
oText = ThisComponent.Text
ParaEnum = oText.createEnumeration() 'makes a collection of paragraphs.
While ParaEnum.hasMoreElements()
   Para = ParaEnum.nextElement()
   PortionEnum = Para.createEnumeration()
   While PortionEnum.hasMoreElements()
      Portion = PortionEnum.nextElement()
      Print Portion.String
   Wend
Wend

My oText variable is equivalent to your document.Text. The Python code would look much the same. Of course, you don't want to just print the Portion, but I hope that gets you started.
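
As a rough sketch of how that loop might carry over to Python, the hasMoreElements()/nextElement() pair can be wrapped in a generator so that ordinary for loops work. The names `uno_iter` and `portion_strings` are made up for this illustration; only createEnumeration(), hasMoreElements(), nextElement(), supportsService() and getString() come from the UNO API.

```python
def uno_iter(enumerable):
    """Yield each element of an object that offers createEnumeration()."""
    enum = enumerable.createEnumeration()
    while enum.hasMoreElements():
        yield enum.nextElement()


def portion_strings(text):
    """Return one list of portion strings per paragraph of `text`."""
    result = []
    for para in uno_iter(text):
        # The text enumeration also yields tables; keep real paragraphs only.
        if not para.supportsService("com.sun.star.text.Paragraph"):
            continue
        result.append([portion.getString() for portion in uno_iter(para)])
    return result
```

Called as `portion_strings(document.Text)`, this collects the portions instead of printing them.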
Windows 10 and Linux Mint, since 2017

Re: [Python] How to read text from Writer paragraph by parag

Post by _savage » Thu Aug 08, 2013 11:04 pm

Thank you, FJCC, that's exactly what I was looking for!

For completeness' sake, here's the equivalent Python code:
Code:
parenum = document.Text.createEnumeration()
while parenum.hasMoreElements():
    par = parenum.nextElement()
    # the enumeration also yields text tables, which have no text portions; skip them
    if not par.supportsService("com.sun.star.text.Paragraph"):
        continue
    # check par.ParaStyleName here for Heading, body text, or any other paragraph style
    textenum = par.createEnumeration()
    while textenum.hasMoreElements():
        text = textenum.nextElement()
        # check text.CharPosture, text.CharWeight, and other character properties here
        print(text.getString())
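
As a sketch of what the CharPosture and CharWeight check might look like: the function name `portion_format` and the returned dictionary layout are invented for this example. It relies on com.sun.star.awt.FontWeight.BOLD being 150.0 and on the FontSlant enum values ITALIC and OBLIQUE. Treating anything at or above BOLD as bold, and accepting a plain string for the posture, are assumptions of the sketch.

```python
# Numeric value of the UNO constant com.sun.star.awt.FontWeight.BOLD
FONT_WEIGHT_BOLD = 150.0


def portion_format(portion):
    """Return the text of a portion together with simple bold/italic flags."""
    posture = portion.CharPosture
    # PyUNO hands back enum values as objects carrying a .value string such
    # as "ITALIC". Accepting a plain string too is an assumption made here
    # so the function can be tried without a running office.
    posture_name = getattr(posture, "value", posture)
    return {
        "text": portion.getString(),
        "bold": portion.CharWeight >= FONT_WEIGHT_BOLD,
        "italic": posture_name in ("ITALIC", "OBLIQUE"),
    }
```

Inside the inner loop, `portion_format(text)` could then replace the print() call, for example to collect only the bold portions of a paragraph.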