[OpenOffice][Writer]Retrieving text from a Writer document


Postby meltigel » Fri Jun 14, 2019 10:32 am

Hello everyone,
Maybe this is an obvious question, but how can I retrieve the text of a Writer document?

I have written a basic VB application. My goal is to insert some escape sequences in a Writer document. When I open this document inside my application (which holds a table mapping each "escape sequence" to a "sentence"), the "macro" needs to read the text in the document, look up each escape sequence in the table, and substitute the corresponding sentence back into the document.

I understand that there are two cursors, but their usage is not very clear to me. Can someone explain how to read from the document, or point me to a guide, tutorial, or documentation?

Thank you everyone
OpenOffice 4.1.6 on Windows 10 x64
meltigel
 
Posts: 9
Joined: Fri Jun 14, 2019 10:19 am

Re: [OpenOffice][Writer]Retrieving text from a Writer docume

Postby RoryOF » Fri Jun 14, 2019 10:38 am

The definitive work on OO macros is downloadable from http://www.pitonyak.org/oo.php

Note that you should forget VB: OpenOffice Basic differs from it at the lower levels one needs for document manipulation.
Apache OpenOffice 4.1.6 on Xubuntu 18.04.2 (mostly 64 bit version) and very infrequently on Win2K/XP
RoryOF
Moderator
 
Posts: 29269
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Postby JeJe » Fri Jun 14, 2019 11:20 am

If I'm following you, then you want to replace a unique string in a document with a string from elsewhere? If that's it, then you can use a search descriptor as follows. This will replace the first occurrence of "fish" with "gas":

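Code:
Sub Main
    Dim mySearch As Object, myResult As Object

    ' Describe what to look for: the literal string "fish"
    mySearch = ThisComponent.createSearchDescriptor()
    mySearch.SearchString = "fish"
    mySearch.SearchRegularExpression = False

    ' findFirst returns Null when nothing matches
    myResult = ThisComponent.findFirst(mySearch)

    If Not IsNull(myResult) Then
        ' Overwrite the matched range with the new text
        myResult.String = "gas"
    End If
End Sub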
Openoffice 4.1.2
Windows 8
JeJe
Volunteer
 
Posts: 551
Joined: Wed Mar 09, 2016 2:40 pm

Postby meltigel » Fri Jun 14, 2019 11:30 am

RoryOF wrote:The definitive work on OO macros is downloadable from http://www.pitonyak.org/oo.php

Note that you should forget VB, OpenOffice BASIC is dissimilar at the lower levels one needs for document manipulation.


Already saw his work, but I am not able to find what I need there...

Postby meltigel » Fri Jun 14, 2019 11:33 am

JeJe wrote:If I'm following you, then you want to replace a unique string in a document with a string from elsewhere? If that's it, then you can use a search descriptor as follows. This will replace the first occurrence of "fish" with "gas":

Code:
Sub Main

mySearch = thisComponent.createSearchDescriptor()
mySearch.searchString = "fish"
mySearch.searchRegularExpression = false
myResult = thisComponent.findfirst(mySearch)
 
if Not IsNull(myResult) then
myResult.string = "gas"
End If
End Sub


The problem is that I have plenty of escape sequences; it is not one fixed word. I already knew about find & replace, but it is not what I need, because I need to evaluate what is written and substitute the correct word accordingly. For example, "fish" -> "car", "dog" -> "wheel", and so on. Also, the user can add more words, so the list is not fixed.

Postby JeJe » Fri Jun 14, 2019 11:42 am

If you want the whole text of the document, it's as simple as:

wholetext = ThisComponent.text.string
ThisComponent.text.string = newtext

But the result needs to be small enough to fit into a string variable. You can also enumerate the paragraphs.

Otherwise this page shows you how to use a text cursor:

https://wiki.openoffice.org/wiki/Writer/API/Text_cursor
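
As a rough illustration of the text-cursor approach, here is a minimal sketch that walks the body text one word at a time and collects the words. The names ListWords, sWords and nGuard are made up for this example, and the cap of 10000 iterations is only an assumed safety limit. As far as I know, a cursor created on the body text does not enter table cells, and the exact word boundaries around punctuation may differ between versions, so treat it as a starting point rather than a finished routine.

```basic
Sub ListWords
    Dim oText As Object, oCursor As Object
    Dim sWord As String, sWords As String
    Dim nGuard As Long

    oText = ThisComponent.getText()
    oCursor = oText.createTextCursor()
    oCursor.gotoStart(False)

    Do
        ' Extend the selection from the current position to the end of the word
        oCursor.gotoEndOfWord(True)
        sWord = oCursor.getString()
        If sWord <> "" Then
            ' A lookup and oCursor.setString(...) could go here instead
            sWords = sWords & sWord & Chr(10)
        End If
        oCursor.collapseToEnd()
        nGuard = nGuard + 1   ' assumed safety cap against an endless loop
    Loop While oCursor.gotoNextWord(False) And nGuard < 10000

    MsgBox sWords, 0, "Words found"
End Sub
```

This is only practical for short documents, since all words end up in one message box.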

Postby meltigel » Fri Jun 14, 2019 12:25 pm

JeJe wrote:If you want the whole text of the document, it's as simple as:

wholetext = ThisComponent.text.string
ThisComponent.text.string = newtext

But the result needs to be small enough to fit into a string variable. You can also enumerate the paragraphs.

Otherwise this page shows you how to use a text cursor:

https://wiki.openoffice.org/wiki/Writer/API/Text_cursor


This code works but, as explained, it retrieves ALL the text... Is there a way to cycle through all the words and process them one at a time?

Postby Lupp » Fri Jun 14, 2019 12:51 pm

Concerning questions that speak unspecifically of "text" and ask for macro guidance: what is seen as text can have a complex structure. Specifically, there may be frames, nested frames, frames nested into table cells (... of tables nested into frames ...).
If you can assure that your "text" is exclusively the body text of the document, please say so explicitly. "No tables at all" would also be helpful in many cases.
On Windows 10: LibreOffice 6.2 and older versions, PortableOpenOffice 4.1.5 and older, StarOffice 5.2
---
Lupp from München
Lupp
Volunteer
 
Posts: 2523
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Postby meltigel » Fri Jun 14, 2019 12:55 pm

Lupp wrote:Concerning questions talking unspecifically of text and asking for macro guidance: What is seen as text can have a complex structure. In specific there may be frames, nested frames, frames nested into table cells, (... of tables nested into frames...).
If you can assure your "text" is the body text of the document exclusively, please do so explicitly. "No tables at all" would also be helpful in many cases.


The document is mixed... there is "plain text" and also tables. But I read some documentation and saw that the table text can be accessed in the same way as plain text. There are no frames or nested things. Only text and tables.

Postby Lupp » Fri Jun 14, 2019 1:04 pm

Do you need the lookup table for replacements inside the Writer doc? Tables are much easier to handle in Calc documents.
Use a replace descriptor as already suggested and loop through your table. The usage will need to be a bit more elaborate, however, I'm afraid. Anyway, a few hundred 'Replace All' calls on the document may be much more efficient (and less error-prone) than simulating all of them at once with code written in Basic.
The one hardly dispensable condition is: No replacement may create a new "escape sequence" that has not yet been finally processed.

You may find most of the needed information here:
https://api.libreoffice.org/docs/idl/re ... iptor.html
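
The loop described above might look roughly like the following sketch. The two arrays are a hypothetical stand-in for the real lookup table, filled with the "fish" -> "car" and "dog" -> "wheel" examples from this thread, and the names ReplaceFromTable, aSearch, aReplace and nCount are invented for the illustration. To my knowledge, replaceAll on the document also reaches text inside tables, but that is worth checking against your own file.

```basic
Sub ReplaceFromTable
    Dim aSearch, aReplace
    Dim oDesc As Object
    Dim i As Integer, nCount As Long

    ' Hypothetical lookup table; in practice it would come from elsewhere
    aSearch  = Array("fish", "dog")
    aReplace = Array("car", "wheel")

    oDesc = ThisComponent.createReplaceDescriptor()
    oDesc.SearchRegularExpression = False
    oDesc.SearchCaseSensitive = True
    oDesc.SearchWords = True   ' match whole words only

    ' One 'Replace All' per table entry
    For i = LBound(aSearch) To UBound(aSearch)
        oDesc.SearchString = aSearch(i)
        oDesc.ReplaceString = aReplace(i)
        ' replaceAll returns the number of replacements it made
        nCount = nCount + ThisComponent.replaceAll(oDesc)
    Next i

    MsgBox nCount & " replacement(s) made"
End Sub
```

The entries are processed in order, so the condition above still applies: if an earlier replacement produces text that matches a later search string, it will be replaced again.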

Postby JeJe » Fri Jun 14, 2019 1:09 pm

You can go through the text one word at a time by creating a text cursor and using gotoNextWord; see the link I posted.

Or you can go through each paragraph in turn with an enumeration. This is from Andrew Pitonyak's book:

Code:
Sub EnumerateParagraphs
    REM Author: Andrew Pitonyak
    Dim oParEnum 'Enumerator used to enumerate the paragraphs
    Dim oPar     'The enumerated paragraph
    REM Enumerate the paragraphs.
    REM Tables are enumerated along with paragraphs
    oParEnum = ThisComponent.getText().createEnumeration()
    Do While oParEnum.hasMoreElements()
        oPar = oParEnum.nextElement()
        REM This avoids the tables. Add an else statement if you want to
        REM process the tables.
        If oPar.supportsService("com.sun.star.text.Paragraph") Then
            MsgBox oPar.getString(), 0, "I found a paragraph"
        ElseIf oPar.supportsService("com.sun.star.text.TextTable") Then
            Print "I found a TextTable"
        Else
            Print "What did I find?"
        End If
    Loop
End Sub
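
Since the document in question contains tables, the TextTable branch could hand the table to a small helper that reads every cell. The following is a minimal sketch, not part of Andrew Pitonyak's listing; the names ShowTableCells and EnumerateTables are made up for the example, and it assumes simple tables without nested tables inside the cells.

```basic
Sub ShowTableCells(oTable As Object)
    Dim aNames, oCell As Object
    Dim i As Integer

    ' getCellNames returns names such as "A1", "B1", ...
    aNames = oTable.getCellNames()
    For i = LBound(aNames) To UBound(aNames)
        oCell = oTable.getCellByName(aNames(i))
        ' A cell is itself a text object, so getString works on it
        MsgBox oCell.getString(), 0, "Cell " & aNames(i)
    Next i
End Sub

Sub EnumerateTables
    Dim oEnum, oElem
    oEnum = ThisComponent.getText().createEnumeration()
    Do While oEnum.hasMoreElements()
        oElem = oEnum.nextElement()
        If oElem.supportsService("com.sun.star.text.TextTable") Then
            ShowTableCells oElem
        End If
    Loop
End Sub
```

To change a cell instead of just showing it, oCell.setString(...) can be used in place of the MsgBox call.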

Postby RoryOF » Fri Jun 14, 2019 1:12 pm

Store the find and replace strings in a Python dictionary, then, for every word in the dictionary, replace the text with its definition (the replacement string). It is simple to write a small routine to extend the dictionary with new words and replacements.

Sample code is given in:
https://stackoverflow.com/questions/20502862/python-replacing-words-in-a-string-with-entries-from-a-dictionary

https://www.daniweb.com/programming/software-development/code/216636/multiple-word-replace-in-text-python

Postby meltigel » Fri Jun 14, 2019 3:48 pm

Lupp wrote:Do you need the lookup table for replacements inside the Writer doc? Tables are much easier handled in Calc docs.
Use a replace descriptor as already suggested and loop through your table. The usage will need to be a bit more elaborate, however, I'm afraid. Anyway a few 100 'Replace All' for the document may be much more efficient (and less error-prone) than simulating code for all of them at a time written in Basic.
The one hardly dispensable condition is: No replacement can create a new "escape sequence" not yet finally processed.

You may find most of the needed information here:
https://api.libreoffice.org/docs/idl/re ... iptor.html


The table needs to be stored inside the VB application, because no one should be able to see or manipulate it. I will try the enumeration...

Postby meltigel » Fri Jun 14, 2019 3:54 pm

RoryOF wrote:Set the find and replace strings as a dictionary using Python, then for all words in dictionary replace the text with the definition (the replace) string. Simple to write a small routine to extend the dictionary for new words/replacements.

Sample code given in
https://stackoverflow.com/questions/20502862/python-replacing-words-in-a-string-with-entries-from-a-dictionary

https://www.daniweb.com/programming/software-development/code/216636/multiple-word-replace-in-text-python



I cannot use Python, because I have already used VB and need to stick with it.

Postby Villeroy » Fri Jun 14, 2019 4:17 pm

For VB or VBA you need a Microsoft product.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.x
Villeroy
Volunteer
 
Posts: 26975
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Postby meltigel » Fri Jun 14, 2019 4:21 pm

I know, but for wide distribution I need more compatibility, hence the use of OO Automation...

Postby RoryOF » Fri Jun 14, 2019 4:55 pm

Be aware that OpenOffice code will not work on Microsoft Office, and contrariwise.

Postby Lupp » Fri Jun 14, 2019 5:04 pm

Only at a very low level can VBA code (for Excel, e.g.) be moved to AOO / LibO (a bit better to LibO). User code relying on the API will rarely (next to never) be portable to VBA. If you need VBA, you need MS Office. And that is the purpose of VBA: user lock-in. Though AOO / LibO don't intend this in the same sense as the relevant commercial competitor does, the effect is unavoidably also present the other way round.

I made a demo consisting of 3 files:
1) The processing file containing the replacement table and the code. (lookupAndProcess.ods)
2) A small text file as the example to work on. (textForProcessing.odt)
3) The result obtained by running the sub 'processIt' from lookupAndProcess.ods. (textForProcessing.odt_rep.odt)
The result also demonstrates what was meant by
Lupp wrote:The one hardly dispensable condition is: No replacement will create a new "escape sequence" not yet finally processed.

You may run the process to verify it, and you may try to port it to VBA, or to get it to run in Excel. Lots of fun!
Create an empty folder, load the three files into that folder, inspect the two text files and close them again, then open the spreadsheet file and run the Sub. ...
Attachments:
textForProcessing.odt_rep.odt (27.86 KiB)
lookupAndProcess.ods (12.76 KiB)
textForProcessing.odt (27.84 KiB)

Postby meltigel » Fri Jun 14, 2019 5:10 pm

I will check your files soon, and I thank you in advance. So, you're telling me that the things Word Automation can do cannot be accomplished with Writer Automation? Of course I know that the code will be completely different... I ask because at my workplace some coworkers managed to automate Excel/Calc (import/export of some articles), with differences in the code... I was wondering whether the same level of automation can be achieved with Word/Writer but, judging from everyone's answers, it seems not...

Postby Lupp » Fri Jun 14, 2019 5:33 pm

My demo may show that AOO / LibO can do it better. I cannot know, since I have had no access to MS Office for about 15 years now. What I can say anyway is that the automation code will not be compatible beyond its most obvious structure, and even that only if the code is structured somewhat "top down" at all.

Postby JeJe » Fri Jun 14, 2019 5:51 pm

There seems to be some mixing up of things here. Automating OO or LO has nothing to do with VBA or Microsoft Office. It means controlling OO or LO externally...

https://www.openoffice.org/udk/common/m ... ation.html

I presume that's what you mean? You say VB, but it's not clear whether you mean VB6, VB.NET or some other VB.

I briefly experimented with VB6 and OO automation and it seemed to work fine. You'll need to try what you want to do to see whether it will work or not.