[Solved] Cycle through Writer document, changing all page styles

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
paul1149
Posts: 29
Joined: Mon Jun 27, 2016 12:56 am

Post by paul1149 »

I have an .odt doc that was converted from .pdf. Consequently, each page has its own style, named sequentially "Converted#", where # increments with each page. (All those page styles really bog down document loading also.)

I want to convert all page styles to my own "Beige_Default", which I have imported into the document.

Currently I have pieced together the following macro. In the past it has worked occasionally, which I don't understand because it's not working presently.

Code: Select all

sub Page_styles_convert1

dim document   as object
dim dispatcher as object

document   = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")

dim args1(1) as new com.sun.star.beans.PropertyValue
args1(0).Name = "Template"
args1(0).Value = "Beige_Default"
args1(1).Name = "Family"
args1(1).Value = 8

For i = 0 To 276
	dispatcher.executeDispatch(document, ".uno:StyleApply", "", 0, args1())
	dispatcher.executeDispatch(document, ".uno:GoToNextPage", "", 0, Array())
Next

end sub
The macro does move through the doc, but nothing is changed. Any help would be appreciated.

I'm on LO 7.6.4.1

Thanks
Last edited by Hagar Delest on Thu Dec 28, 2023 11:03 pm, edited 1 time in total.
Reason: tagged solved.
LibreOffice 7.6.4.1
RoryOF
Moderator
Posts: 34619
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Post by RoryOF »

You could open the file as a zip archive, extract content.xml, and use a plain text editor to replace all instances of Converted??? with Beige_Default. Then save content.xml and reinsert it into the archive.

As this process could be catastrophic if an error is made, it is best (read: ESSENTIAL) to work on a copy of the OpenOffice file.

The exact syntax of Converted??? will depend on the detail of content.xml; you may need to use a plain text editor that can handle wild cards or regular expressions.
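
For anyone who would rather script this edit than do it by hand, the archive route can be sketched in Python. This is only a minimal sketch under stated assumptions: that the page style references in content.xml appear as `style:master-page-name="ConvertedN"` attributes, and that the replacement name is spelled in the XML exactly as it is passed in. LibreOffice encodes some characters in internal style names (a space becomes `_20_`, for example), so it is worth checking how the target style is actually named inside the archive first. The function names and file names are made up for illustration.

```python
import re
import zipfile

# Assumed form of the references inside content.xml; check your own file first.
MASTER_PAGE_REF = re.compile(r'style:master-page-name="Converted\d+"')


def retarget_master_pages(xml_text, target):
    """Point every Converted<N> master-page reference at `target`."""
    replacement = 'style:master-page-name="%s"' % target
    # A function replacement avoids backslash-escape processing of `target`.
    return MASTER_PAGE_REF.sub(lambda _match: replacement, xml_text)


def rewrite_odt(src_path, dst_path, target):
    """Copy src_path to dst_path, rewriting only content.xml on the way."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "content.xml":
                text = data.decode("utf-8")
                data = retarget_master_pages(text, target).encode("utf-8")
            # Keep each entry's original order and compression. ODF wants
            # the "mimetype" entry first and stored uncompressed, which holds
            # here as long as the source archive was written that way.
            dst.writestr(info.filename, data, compress_type=info.compress_type)
```

A call such as `rewrite_odt("book-copy.odt", "book-restyled.odt", "Beige_Default")` (hypothetical file names) writes a new file and leaves the source untouched. The Converted style definitions themselves are not removed by this sketch; only the references to them are changed.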
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS

Post by paul1149 »

That's clever, and it seems it would work. Would that also delete the "ConvertedX" page styles from the document, something that I have been doing manually after the conversion is effected? I have doubts about that, but it would be a plus.

I will revert to this method if necessary, but I would still like to do this within the GUI using a macro.

Thanks much.

Post by RoryOF »

I think it might rename the ConvertedX style definitions all to the Beige_Default style definition, giving multiple instances of that definition. How OpenOffice will react to that, I cannot say.

If the edit were controlled manually, the ConvertedX style definitions could be left unaltered; they would no longer be used in the layout of the file and would only harmlessly bulk up content.xml.

Post by RoryOF »

paul1149 wrote: Wed Dec 27, 2023 5:28 pm I have an .odt doc that was converted from .pdf. Consequently, each page has its own style, named sequentially "Converted#", where # increments with each page. (All those page styles really bog down document loading also.)

I want to convert all page styles to my own "Beige_Default", which I have imported into the document.
That conversion might be to a format called hOCR, which attempts to preserve the original formatting. When converting book text without tables etc., I normally set my OCR program to produce plain text output, which is easily reformatted to my requirements in OpenOffice without first having to strip out now-redundant formatting.

Post by paul1149 »

I'm wondering if the sporadic results from my macro have something to do with the size of the doc. Currently I have content.xml open in kate, on linux/kde, and after I temporarily removed the linelength restriction and it reloaded, it's pinned the CPU at 9% for ten minutes now, with no sign of abating. The doc is 280 pages, and opening it in LO does the same to CPU for about 4 minutes. Maybe the macro is too demanding for the system - which, BTW, is a quite competent Ryzen.

That 9% appears to be 100% of a cpu core, and performance is limited to that core, because kate is being extremely sluggish.
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Post by JeJe »

What am I missing here... why don't you just select the whole document and apply the page style to it?
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)

Post by paul1149 »

That doesn't work either.

Post by JeJe »

Should do. Can you select the whole document? What happens when you click to apply the page style?

Post by paul1149 »

All the text is selected, but then nothing happens.

Meanwhile geany opens the document without any delay, so I'm trying to work there.
FJCC
Moderator
Posts: 9284
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Post by FJCC »

I copied this macro from Andrew Pitonyak's Useful Macro Information, section 7.8.1. It removes page breaks and their associated change in page style. It might mess up your formatting, so try it on a copy of your document.

Code: Select all

Sub FindPageBreaks
	REM Author: Andrew Pitonyak
	Dim iCnt As Long
	Dim oCursor As Variant
	Dim oText As Variant
	Dim s As String
	oText = ThisComponent.Text
	oCursor = oText.CreateTextCursor()
	oCursor.GoToStart(False)
	Do
		If NOT oCursor.gotoEndOfParagraph(True) Then Exit Do
		iCnt = iCnt + 1
		REM A non-empty PageDescName means the paragraph starts a new page with that style
		If NOT IsEmpty(oCursor.PageDescName) Then
			s = s & "Paragraph " & iCnt & " has a new page to style " & _
				oCursor.PageDescName & CHR$(10)
			oCursor.PageDescName = ""
		End If
		REM Remove any remaining plain page break
		If oCursor.BreakType <> com.sun.star.style.BreakType.NONE Then
			s = s & "Paragraph " & iCnt & " has a page break" & CHR$(10)
			oCursor.BreakType = com.sun.star.style.BreakType.NONE
		End If
	Loop Until NOT oCursor.gotoNextParagraph(False)
	MsgBox s
End Sub
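
If the aim is to keep the page breaks but point them at a single style instead of stripping them, the same idea (walking the paragraphs and looking at `PageDescName`) might be adapted as below. This is a sketch in Python, one of the macro languages LibreOffice supports, and is not Andrew Pitonyak's code. The function names and the "Converted" prefix test are assumptions, and it only visits top-level paragraphs and tables in the body text, not content nested inside tables or frames. As with the macro above, try it on a copy.

```python
def retarget_page_breaks(doc, target="Beige_Default", prefix="Converted"):
    """Re-point paragraph-level page style changes from prefix* to target.

    Returns the number of paragraphs (or tables) that were changed.
    """
    changed = 0
    paragraphs = doc.Text.createEnumeration()
    while paragraphs.hasMoreElements():
        element = paragraphs.nextElement()
        # PageDescName is empty (None or "") unless this element starts
        # a new page with an explicit page style.
        name = getattr(element, "PageDescName", None)
        if name and name.startswith(prefix):
            element.PageDescName = target
            changed += 1
    return changed


def retarget_current_document(*args):
    """Macro entry point; XSCRIPTCONTEXT is supplied by LibreOffice."""
    doc = XSCRIPTCONTEXT.getDocument()
    return retarget_page_breaks(doc)
```

Because this goes through the document model rather than the dispatcher, it does not depend on the view cursor or on which window has the focus.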
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.

Post by JeJe »

You can dispose of the page styles you don't want. When I tested this by applying one style and then disposing of it, the page reverted to the default page style and the style itself was removed.

Edit: back up first of course.

Code: Select all

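oPageStyles = ThisComponent.StyleFamilies.getByName("PageStyles")
For i = 1 To 280
	' Skip numbers with no matching style, so removeByName does not raise an error
	If oPageStyles.hasByName("Converted" & i) Then oPageStyles.removeByName("Converted" & i)
Next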
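
The hard-coded count can be avoided by asking the PageStyles family for its element names. A rough Python equivalent follows, assuming every unwanted style name starts with "Converted" (the helper name is hypothetical). By default it skips styles that report themselves as still in use, so in that mode it only tidies up after the pages have been switched to another style.

```python
def remove_page_styles(doc, prefix="Converted", skip_in_use=True):
    """Remove page styles whose names start with `prefix`.

    Returns the list of names that were removed.
    """
    page_styles = doc.StyleFamilies.getByName("PageStyles")
    removed = []
    # Copy the names first: the container changes while we remove from it.
    for name in list(page_styles.getElementNames()):
        if not name.startswith(prefix):
            continue
        if skip_in_use and page_styles.getByName(name).isInUse():
            continue
        page_styles.removeByName(name)
        removed.append(name)
    return removed
```

Passing `skip_in_use=False` would mirror the loop above and dispose of the styles regardless, so back up first in that case too.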

Post by RoryOF »

paul1149 wrote: Wed Dec 27, 2023 6:17 pm I'm wondering if the sporadic results from my macro have something to do with the size of the doc. Currently I have content.xml open in kate, on linux/kde
...

That 9% appears to be 100% of a cpu core, and performance is limited to that core, because kate is being extremely sluggish.
I can't answer for kate, but I know that OpenOffice is not multicored - reformatting page styles on a large file can take "forever", one core fully occupied, with little touches to another core.

If you have the original PDF, you could try a fresh OCR; on linux (xubuntu 22.04.3) I use gimagereader-qt as a frontend to Tesseract; this produces plain text with a high accuracy, at a rate of perhaps 8-10 pages per minute.

[Note: I found gimagereader-Gtk was very slow on my system, the Qt version was not; I didn't investigate why.]

Some PDF files have the text embedded; MrProgrammer has given some help to extract such text; his posting is at
viewtopic.php?p=410366#p410366

Post by paul1149 »

Ok, this method works:

I opened content.xml in geany. I did the regex substitution for all instances of the Converted# style. I saved the file and updated the archive.

When I opened the doc in LO, it again took some time to process, but not as much as before. This told me that the style change had probably been effected, but the Converted# styles were still present. Which makes sense, since a style doesn't have to be used to be present in a doc.

And that indeed was the case. Now, to get rid of the old styles was simply a matter of mass-selecting them in Styles Inspector, and Deleting.

So this is a doable way to get this done. And I thank you again for the idea. On large files this could be better than via a LO macro because it circumvents most of the long loading wait times. For lesser docs, maybe the macro would work, as it has at times in the past.

Post by RoryOF »

If the book is out of copyright (and sometimes not even), it may be possible to find an ePub version of it, which can be opened to give plain text, ready for reformatting.

The downloadable program Calibre will readily convert a PDF or ePub file to (inter alia) TXT, DOC, or ODT.
One may need to change the default configuration to add some specific conversions.

Post by paul1149 »

JeJe - Thanks. Deleting from within Styles Inspector is easy once the styles are no longer in use. But I will keep this for a rainy day.

FJCC - thanks. That could come in handy for some plain documents, but on others I'd like to retain some of the formatting. I will save it for possible use later.

Rory - LO has a multi-threading option under its Calc module, but I don't know if it affects Writer; apparently not, because that 9%/100% limitation holds.

The rescan is a thought. Also, KDE's okular has a text extraction function. It works well, but all formatting is lost. I am going to bookmark your post.


---

In all this no one faulted the macro I've been using, which leads me to think that the core overload hypothesis might be the explanation of what's going on.

Post by JeJe »

You could try locking controllers so the screen doesn't update till the macro is finished.
And/or you could try adding a wait statement to give each operation in the loop time to complete before continuing.

Code: Select all

sub Page_styles_convert1

dim document   as object
dim dispatcher as object

document   = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")

dim args1(1) as new com.sun.star.beans.PropertyValue
args1(0).Name = "Template"
args1(0).Value = "Beige_Default"
args1(1).Name = "Family"
args1(1).Value = 8
on error goto hr
thiscomponent.lockcontrollers

For i = 0 To 276
	dispatcher.executeDispatch(document, ".uno:StyleApply", "", 0, args1())
	dispatcher.executeDispatch(document, ".uno:GoToNextPage", "", 0, Array())
	wait 1000 'wait a second
Next
hr:
thiscomponent.unlockcontrollers

end sub

Post by JeJe »

Dispatch calls tend to fail silently: they neither give an error message nor stop code execution.

You can capture the return value and examine it to see whether the call succeeded or not.

Code: Select all

ret = dispatcher.executeDispatch(document, ".uno:StyleApply", "", 0, args1())
Edit: I presume you ran that on one page first to see if it works before putting it in the loop.
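
To make that check more concrete: when `executeDispatch` returns anything, it should be a `com.sun.star.frame.DispatchResultEvent`, whose `State` field holds one of the `DispatchResultState` constants. The small Python helper below is only a sketch of how the return value might be turned into something readable. The helper name is hypothetical, and the numeric values are an assumption that should be verified against the API reference. Many dispatches return nothing at all, which indicates neither success nor failure.

```python
# Values of com.sun.star.frame.DispatchResultState (assumed; verify them
# against the API reference for your version).
DISPATCH_STATES = {0: "FAILURE", 1: "SUCCESS", 2: "DONTKNOW"}


def describe_dispatch_result(ret):
    """Return a short label for the value returned by executeDispatch."""
    if ret is None:
        # Nothing came back, which says nothing about success.
        return "NO RESULT"
    state = getattr(ret, "State", None)
    return DISPATCH_STATES.get(state, "UNKNOWN")
```

Logging this label for the first page only, before running the full loop, would show quickly whether the `.uno:StyleApply` call is being accepted at all.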

Post by RoryOF »

For less experienced Users:

It cannot be stressed too heavily that external processing of content.xml, or indeed any file in the Open-/Libre-Office archive, can be fatal if errors are made. Any such work should [read: MUST] be done on a copy of the original, so that there is a fall-back position if disaster strikes.

Post by paul1149 »

Thanks. Great ideas. I do use lock controllers on some other big macros. This would be a good place for it too.

I did try the dispatcher statement without the looping. While in the past it worked, often even with looping, this time it didn't work on the single page. Which is why I've been so baffled. When I did this on a different doc two days ago, it also did not work. So I opened the doc again yesterday morning preparing to deal with the problem, only to find that now the pages had been changed. It was then that I began to suspect the core overload problem. I also tried closing/reopening on today's doc, and this time it didn't work. Today's doc is larger than the previous one.

Post by JeJe »

If it's a big document you need to give it time to load.
Sometimes dispatch calls won't work unless they are run while the document has the focus, e.g. from Tools > Macros > Run Macro and not from the IDE.

Post by paul1149 »

That's interesting, I didn't know that. I'll try that next time.

In the past, converted documents oftentimes continued to be sluggish. Then I discovered that deleting the abandoned styles took care of that. But today's document remains sluggish even after cleanup. Doing anything in the doc sends the CPU to 9%, where it stays for 15 s. Maybe further cleanup will help.

Post by RoryOF »

Roughly how many words and pages?

Post by paul1149 »

227 pages, 455k words, 626kb.

---

I just cleaned up a lot of hidden frames, it's down to 622kb, and it's quite a bit better now. LO is sensitive to extra formatting I find.

Post by paul1149 »

I removed all "accidental" (in situ) formatting, and that has improved performance immensely. That seems to be a big drag on the LO engine.

Post by RoryOF »

The file size, in terms of words and pages, is not a problem; as a rough rule of thumb OpenOffice can easily handle 1000 pages or more of plain text. Tolstoy's "War and Peace", which I use as a trial large file (I have larger) is 550,000+ words and 2000+ A5 pages and is easily editable to change layout.

Post by JeJe »

When you created the "Beige_Default" page style, did you set its Next Style property to "Beige_Default" as well?

Perhaps that or something similar is amiss - this shouldn't be a macro problem: selecting all the pages and applying the page style should work.

Edit: if the document is opened read only nothing will happen.
Zizi64
Volunteer
Posts: 11364
Joined: Wed May 26, 2010 7:55 am
Location: Budapest, Hungary

Post by Zizi64 »

227 pages, 455k words, 626kb.
Please create a copy of your document, delete most of the pages, and upload a sample file of a few pages here.
Tibor Kovacs, Hungary; LO7.5.8 /Win7-10 x64Prof.
PortableApps/winPenPack: LO3.3.0-7.6.2;AOO4.1.14
Please, edit the initial post in the topic: add the word [Solved] at the beginning of the subject line - if your problem has been solved.

Post by paul1149 »

Yes, the next style is Beige_Default. It's meant to be for the main body of a document.

Post by paul1149 »

Also, a bit of document history that may be pertinent: the original was a .pdf. I converted it online to .docx, then used LO to save it as .odt. I imagine that would complicate the formatting and slow performance.