Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 2:58 pm
by biowfp
Hello, I'm using OpenOffice 4.1.2 on Windows XP and have a problem with copying text.

So basically, I have lots of frames with text in them and sometimes need to copy that text into a text editor. Is there any way to do it other than double-clicking every single frame, selecting all the text, and copying it?
I also tried to select and copy the boxes themselves, but they are too big, and when I paste them into the text editor, resizing them to fit the page makes them unreadable.

Re: Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 4:17 pm
by Zizi64
So basically, I have lots of frames with text in them and sometimes need to copy that text into a text editor.


Is it a PDF file, opened with the PDF import function?

If the answer is YES, then you can try converting it to .txt format by another method, with other software.

Re: Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 4:27 pm
by Zizi64
...or you can try to write a macro to do this. The macro would walk through all of the text boxes in the Draw document, page by page.
(I am wondering: is it appropriate to do this by the index of the objects? The index order may not be the same as the order of the physical positions.)

Re: Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 4:53 pm
by RoryOF
Where did the text come from? If from a file or Internet Paste, use that rather than taking the text from the secondary source in Draw.

Re: Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 5:50 pm
by biowfp
Everything is made in Draw; it's an organizational structure. The macro approach sounds about right. What will it output, and how?

Re: Select text from multiple text boxes to copy all at once

Posted: Mon Feb 12, 2018 10:51 pm
by Lupp
You may play with the BASIC Sub I made some time ago when a similar question was posted in a different forum. It is contained in the attached demo. Of course, you will want to adapt the code to your needs, and place it into a module of your local Standard library, if you actually want to use it.

BTW: TextBoxes are very different from Frames in Writer documents. Draw doesn't support frames.

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 1:33 pm
by keme
Did you try the PDF approach Zizi64 hinted at?
  • Export the drawing to a PDF file
  • Open that file in Adobe Reader
  • Select all, copy (ctrl+A, ctrl+C)
  • New text document, paste.
"Quick and dirty" (the resulting text is most likely a mess of jumbled words and phrases, cf. the comment from Zizi64 about ordering), but so little work that it is perhaps worth a try.

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 4:51 pm
by keme
If you have PowerShell installed, you can use the following script. It pulls text from rectangle and ellipse objects and saves to a fairly tidy text file. Copy to a plaintext editor and save as "extract.ps1". Slight editing required to pull text from other object types.

Code:
# PowerShell script (5.0 or later, which provides Expand-Archive) to extract rectangle/ellipse text from an OpenOffice Draw file
# Usage: powershell <scriptfile>.ps1 -from <drawingfile>.odg
# Saves the extracted text in file "extract.txt"

param($From)

# Split the filespec to filename and path
$Object = Split-Path $From -Leaf
$SourcePath = Split-Path $From -Parent -Resolve

# Set up temporary workspace next to the drawing (not in the current directory)
$TargetPath = Join-Path $SourcePath -ChildPath TempStore
mkdir $TargetPath | Out-Null
$TargetItem = Join-Path $TargetPath -ChildPath $Object

$TemporaryTarget = Join-Path $TargetPath -ChildPath "Convert.zip"

# Move data into workspace
copy $From -Destination $TargetPath
Rename-Item $TargetItem -NewName $TemporaryTarget

Expand-Archive -Path $TemporaryTarget -DestinationPath $TargetPath -Force

# Extract the XML and pull the relevant text content from it
[xml]$DrawingData = Get-Content -Path (Join-Path $TargetPath -ChildPath "content.xml")

$DrawingData.'Document-Content'.body.drawing.page.rect.p.'#text' | out-file "extract.txt"
$DrawingData.'Document-Content'.body.drawing.page.ellipse.p.'#text' | out-file "extract.txt" -Append

# Cleanup
Remove-Item $TargetPath -Recurse -Force

Usage: C:\>powershell .\extract.ps1 -From <draw-file>
Note: This is a quick mockup using powerful commands ("remove-item ... -recurse -force" can do some damage if targeting existing content). Use with caution!
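
For readers who would rather avoid the temporary folder and the "Remove-Item" step altogether, the same idea can be sketched in Python. This is only an illustration, not part of the original script: it reads content.xml straight out of the .odg archive (an .odg file is a ZIP container) and collects the text of every paragraph on each page. The function name extract_page_texts and the script name used below are assumptions made for this example, and the order of the texts is simply the order in which they appear in the file.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format
DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_page_texts(odg_file):
    """Return a list of (page_name, [text, ...]), one entry per draw:page.

    odg_file may be a file name or a binary file-like object.
    Nothing is written to disk, so there is nothing to clean up afterwards.
    """
    with zipfile.ZipFile(odg_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    pages = []
    for page in root.iter("{%s}page" % DRAW):
        name = page.get("{%s}name" % DRAW, "")
        texts = []
        # iter() walks the whole subtree, so every text:p below the page is seen
        for para in page.iter("{%s}p" % TEXT):
            content = "".join(para.itertext()).strip()
            if content:
                texts.append(content)
        pages.append((name, texts))
    return pages


if __name__ == "__main__" and len(sys.argv) > 1:
    for page_name, page_texts in extract_page_texts(sys.argv[1]):
        print("== %s ==" % page_name)
        print("\n".join(page_texts))
```

Usage (script name hypothetical): python extract_text.py drawing.odg > extract.txt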

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 5:09 pm
by biowfp
So.

I tried the "PDF approach". Its ordering seems to depend somehow on the relative "height" of the text, and that turns the final text file into a complete mess, as I need to preserve the existing vertical hierarchy, at least roughly.

I also tried the "demo macro" from Lupp, and that also worked, kind of. It messed up the ordering in a way I can't make sense of, but not everywhere, and as I have zero coding experience I can't tell why or how to fix that. Zizi64's hint about the theoretical possibility of preserving the order drives me crazy for that same reason. Maybe there is some kind of stock of such macros?

I'd like to try your way as well, but the "damage possibility" kind of puts me off. Maybe you could elaborate a bit more on it?

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 7:39 pm
by Lupp
Going through the graphical objects contained in (the DrawPage of) a page ("Slide") of a 'Draw' document, the objects are processed in the order in which they were inserted. If this messes up the output, the root cause may be that the drawing was created unsystematically. To get output that respects the position on the page, the output must be delayed until all the objects have been inspected and the positions (Y only?) have been associated with the texts (in an array, e.g.).

However, this isn't everything about the issue. Text boxes, and other kinds of shapes capable of containing text, may have lots of spare area above and to the left of the text, and they may overlap. A grouping hierarchy must also be taken into account. Restricting the export to the currently selected objects is comparatively simple. But I doubt it would be a clever use of my time to first do the needed research and then implement the export procedure based on the results.
On the other hand, the resolving of groups in particular is an interesting aspect...

But I surely won't work on a Q&D "solution" again. If I ever rework the code in the posted example, I will do it as time permits. Sorry if that's too late for you.

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 10:29 pm
by keme
biowfp wrote:...
I'd like to try your way as well, but the "damage possibility" kind of puts me off. Maybe you could elaborate a bit more on it?

As the script appears, it is safe as long as you don't have a folder named "TempStore" within the same folder where your document is, and containing important files. If you edit the script or use it as a guide for writing your own, you should remove or disable the last line (the one that begins with "Remove-Item ...") until you are certain that the variables are set correctly.

Also, the order of elements from my script will most likely be governed by the order in which they were inserted. This is probably not an improvement compared to the (perceived) "stacking order". It is possible to sort the output according to y-position, but that requires more elaborate programming. Coding and testing for this purpose will probably be more work than manual extraction, and the result will still require thorough editing.
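
To give an idea of what "sort the output according to y-position" could look like, here is a minimal Python sketch. It assumes the page has already been parsed from content.xml into an ElementTree element, that each shape carries its position in the svg:x and svg:y attributes, and that lengths use one of the units mm, cm, in, pt or pc. Rotated shapes, lines and grouped shapes are not handled. The names to_mm and texts_in_reading_order are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format
SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Millimetres per unit, for the units this sketch understands
MM_PER_UNIT = {"mm": 1.0, "cm": 10.0, "in": 25.4, "pt": 25.4 / 72, "pc": 25.4 / 6}


def to_mm(length):
    """Convert a length such as '2.5cm' to millimetres (0.0 if missing or not understood)."""
    match = re.fullmatch(r"\s*(-?\d*\.?\d+)\s*(mm|cm|in|pt|pc)\s*", length or "")
    if not match:
        return 0.0
    return float(match.group(1)) * MM_PER_UNIT[match.group(2)]


def texts_in_reading_order(page):
    """Return the texts of the shapes on one page, top to bottom, then left to right."""
    rows = []
    for shape in page:  # direct children of the page only
        paragraphs = ["".join(p.itertext()) for p in shape.iter("{%s}p" % TEXT)]
        text = "\n".join(t for t in paragraphs if t.strip())
        if text:
            y = to_mm(shape.get("{%s}y" % SVG))
            x = to_mm(shape.get("{%s}x" % SVG))
            rows.append((y, x, text))
    # Output is delayed until every shape has been inspected, then sorted by position
    rows.sort(key=lambda row: (row[0], row[1]))
    return [text for _, _, text in rows]
```

The sort key uses the top-left corner of each shape, so boxes with a lot of empty space above their text may still come out in an unexpected place and the result will probably need manual editing.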

Re: Select text from multiple text boxes to copy all at once

Posted: Tue Feb 13, 2018 10:48 pm
by Villeroy
A simplified version of Lupp's macro.
It creates a new text document and generates one text table per page.
Each table has 3 columns X, Y and Text.
X is the distance from the left border.
Y is the distance from the upper border.
Text is the formatted text of a shape if the shape has formatted text. If a shape does not provide any formatted text, for instance the push button on my document, the corresponding table cell gets an empty text.
Nothing is saved. You just get a new Writer document with a bunch of tables.
Now you can sort the text table(s) by the X or Y values. You can edit the numbers to get the text into the right order.
When the order is right, you can remove the 2 numeric columns and dissolve the table via menu:Table>"Table to text"
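
The same "X, Y, Text" layout can also be produced outside of Writer. The following Python sketch is only an illustration of that idea, not a replacement for the macro: it turns a list of (x, y, text) tuples into CSV text that can be opened in a spreadsheet, sorted by either numeric column, and then reduced to the Text column. The function name rows_to_csv and the sample values are hypothetical.

```python
import csv
import io


def rows_to_csv(rows):
    """Render (x, y, text) tuples as CSV text with an 'X,Y,Text' header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["X", "Y", "Text"])
    for x, y, text in rows:
        writer.writerow([x, y, text])
    return buffer.getvalue()


# Hypothetical sample positions, for illustration only
sample = [(30.0, 50.0, "Worker"), (30.0, 10.0, "Boss")]
print(rows_to_csv(sample))
```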

To be done by someone else: I don't understand text cursors. So I can't insert paragraph breaks between tables.

Re: Select text from multiple text boxes to copy all at once

Posted: Thu Feb 15, 2018 12:21 pm
by Lupp
Of course(?), I don't "understand" TextCursors either.
But I was able to rewrite Villeroy's respective function to make it insert a paragraph above any newly inserted table for a DrawPage:
Code:
Function getNewWriterTable(pDoc, pCols, pRows)
  Dim theText, theTable, theTC, cPB
  cPB      = com.sun.star.text.ControlCharacter.PARAGRAPH_BREAK 'Integer constant (complicated way to write 0)
  theText  = pDoc.GetText
  'Place a text cursor at the end of the document text
  theTC    = theText.CreateTextCursorByRange(theText)
  theTC.GotoEnd(False)
  'Insert a paragraph break so that consecutive tables are separated
  theText.InsertControlCharacter(theTC, cPB, False)
  'Create the table and insert it at the cursor position
  theTable = pDoc.CreateInstance("com.sun.star.text.TextTable")
  theTable.Initialize(pRows, pCols)
  theText.InsertTextContent(theTC, theTable, False)

  getNewWriterTable = theTable
End Function

The reworked code is included in the new attachment. It also demonstrates that this solution will not get the texts from grouped shapes, even though such a group on page 3 looks exactly the same as the two ungrouped shapes on page 1. (On page 2, the detection of the text entered as a kind of label for the ball would fail if it was grouped with the ball, as should be expected.)
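
The effect of grouping can be illustrated on the file format level as well. This is a Python sketch working on the parsed content.xml of a drawing, not the Basic code from the attachment: in the saved file a group is stored as a draw:g element that contains its member shapes, so a loop over the direct children of a page misses them unless it descends into the group. The function name shape_texts is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format
DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def shape_texts(element):
    """Yield the text of each shape below element, descending into draw:g groups."""
    for child in element:
        if child.tag == "{%s}g" % DRAW:
            # A group contains shapes or further groups: resolve it recursively
            yield from shape_texts(child)
        else:
            paragraphs = ["".join(p.itertext()) for p in child.iter("{%s}p" % TEXT)]
            text = "\n".join(t for t in paragraphs if t.strip())
            if text:
                yield text
```

Calling list(shape_texts(page)) on a draw:page element returns the texts in document order, with the members of each group appearing at the position of the group.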

In the meantime I also found time to look into a way to recursively resolve groups and multiselections in 'Draw' documents. I only did it out of general interest, but if somebody has a use for this and lets me know, I will post the respective code in the 'Snippets' branch.

I did not yet find a way to access the selection of pages (slides) from the Page Panel (labelled "Pages"; only visible if enabled in 'View'). The selection may consist of a single page, but multiple selection is also supported. See also my question here: https://ask.libreoffice.org/en/question/146304/.

Re: Select text from multiple text boxes to copy all at once

Posted: Thu Feb 15, 2018 12:50 pm
by biowfp
Guys, you are so awesome, big thanks to everyone who replied and spent their time helping! I'm going to try the latest suggestions when I get some time (busy with other work at the moment) and will report back, probably at the start of next week.

Re: Select text from multiple text boxes to copy all at once

Posted: Thu Feb 15, 2018 5:20 pm
by Villeroy
With the addition by Lupp it is easier to distinguish the separate text tables and select them with the mouse: click near the top-left corner when the mouse cursor shows a diagonal arrow pointing to the table corner.
Without mouse: menu:Table>Select>Table
Once you have selected the entire table object, you can call the sort command in the table menu and sort by column 1 or 2 using the keytype "Numeric".

P.S. and of course you can drag the shapes in Draw so the y-order (vertical order) of the top-left corner points reflects the right text order. Then run the macro, sort by Y, remove the X and Y, dissolve the table.