Sorting paragraphs alphabetically

samkas
Posts: 1
Joined: Fri Feb 27, 2009 10:52 pm

Sorting paragraphs alphabetically

Post by samkas »

SORT in Writer
I am new to OpenOffice.org Writer. There are two SORT operations that I can do easily in WordPerfect 11 but cannot get to work in Writer (3.0).
For example, take the following text from my dictionary document:

Lampoon - noun
: a composition that imitates or misrepresents someone's style in a humorous way

Relucent - adjective xx
: shining; bright.

Freeload - verb xx
: To take advantage of the charity, generosity, or hospitality of others.


In WordPerfect I go to SORT, choose to sort by paragraph, alphabetically, and the selection is sorted correctly as follows:


Freeload - verb xx
: To take advantage of the charity, generosity, or hospitality of others.

Lampoon - noun
: a composition that imitates or misrepresents someone's style in a humorous way

Relucent - adjective xx
: shining; bright.


In Writer I get the following result instead, which is a sort by rows (every line is treated as a separate item):

: a composition that imitates or misrepresents someone's style in a humorous way
: shining; bright.
: To take advantage of the charity, generosity, or hospitality of others.
Freeload - verb xx
Lampoon - noun
Relucent - adjective xx

Next, in WordPerfect I use the same SORT menu to select, across the whole document, the paragraphs/records with xx in the first line, and I get the following correct result:


Freeload - verb xx
: To take advantage of the charity, generosity, or hospitality of others.

Relucent - adjective xx
: shining; bright.

How can I sort paragraphs in a text document and extract records in OO Writer the way I can in WordPerfect, and just as simply?

Help!

Samkas

Title Edited. A descriptive title for posts helps others who are searching for solutions and increases the chances of a reply (Hagar, Moderator).
OOo 3.0.X on Ms Windows XP + wordperfect11
FJCC
Moderator
Posts: 9577
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: sort

Post by FJCC »

I can answer the first part of your post. It looks like you have paragraph marks at the end of every line, so Writer treats each line as its own paragraph and sorts the lines individually. If you want to introduce a line break that does not create a new paragraph, use Shift+Enter. I tried that and your example sorted as you expected.
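
For anyone creating such entries from a macro rather than from the keyboard, the same distinction between a line break and a paragraph break exists in the API. A minimal OOo Basic sketch, assuming a Writer document is the current component; the sub name and the sample strings are made up for illustration:

```basic
Sub InsertEntryWithLineBreak
	' Hypothetical helper: appends one dictionary entry as a SINGLE paragraph
	Dim oText As Object
	Dim oCursor As Object

	oText = ThisComponent.getText()
	oCursor = oText.createTextCursor()
	oCursor.gotoEnd(False)

	oText.insertString(oCursor, "Lampoon - noun", False)
	' LINE_BREAK is what Shift+Enter inserts: new line, same paragraph
	oText.insertControlCharacter(oCursor, com.sun.star.text.ControlCharacter.LINE_BREAK, False)
	oText.insertString(oCursor, ": a composition that imitates someone's style", False)
	' PARAGRAPH_BREAK is what Enter inserts: this ends the entry
	oText.insertControlCharacter(oCursor, com.sun.star.text.ControlCharacter.PARAGRAPH_BREAK, False)
End Sub
```

Entries built this way are each one paragraph, so Writer's own sort keeps the headword and its definition together.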
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
bluegecko
Posts: 2
Joined: Wed Jan 21, 2009 9:19 pm

Re: Sorting paragraphs alphabetically

Post by bluegecko »

Hi there

I think I have a solution for the first part of your question. I finally got around to writing a macro for WordPerfect-style sorting, in which empty lines define paragraph boundaries. I've attached it to this post as an extension, as it uses a dialog (you can choose between OOo-style and WordPerfect-style sorting). Once you've installed the extension, assign a keyboard or toolbar shortcut to "Main" in the "Sort" library (I'll eventually automate this).

Alternatively, just copy the subroutines SortWP and FindReplaceString (both below) into a module, delete the oDlg.endExecute line, and point your keyboard or toolbar shortcut to the SortWP subroutine.

Code:

REM  *****  BASIC  *****

  ' *****************************************************************
  '		SORT PARAGRAPHS (BY OPENOFFICE STYLE OR WORDPERFECT STYLE)
  ' 	Depends on FindReplaceString
  ' *****************************************************************

private oDlg as object		'Module-level dialogue object, shared by the subs below

Sub Main
	DialogLibraries.LoadLibrary ("Sort")
	oDlg = CreateUnoDialog (DialogLibraries.Sort.SortDialog)
	oDlg.Execute ()					'Show dialogue
	oDlg.dispose ()					'Clean up dialogue
End Sub

Sub SortWP
	REM	sorts WordPerfect-style, where paragraphs are separated by empty lines
	oDlg.endExecute					'close dialog
	Dim oText    					'Text object for the current object
	Dim oVCursor 					'Current view cursor
	Dim oCursor  					'Text cursor

	document = ThisComponent.CurrentController.Frame
	dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")

	oVCursor = ThisComponent.getCurrentController().getViewCursor()
	oText = oVCursor.getText()
	oCursor = oText.createTextCursorByRange(oVCursor)
	
	If (Len(oCursor.getString) = 0) Then									' select all if no selection
		dispatcher.executeDispatch(document, ".uno:SelectAll", "", 0, Array())
	End if

	FindReplaceString(Chr(10),"¶¶¶","True")									' protect manual line breaks
	FindReplaceString("$",Chr(10),"True")									' change returns to line breaks
	FindReplaceString("^\n",Chr(13),"True")									' change empty lines to returns

	dispatcher.executeDispatch(document, ".uno:SortDialog", "", 0, Array())	' open sort dialogue
	FindReplaceString("\n",Chr(13),"True")									' convert line breaks to returns
	FindReplaceString("¶¶¶",Chr(10),"True")									' restore original line breaks

End Sub

Sub SortOOo
	REM	sorts OpenOffice-style (each block of text is a paragraph; ignores empty lines)
	oDlg.endExecute					'close dialog

	Dim oText						'Text object for the current object
	Dim oVCursor					'Current view cursor
	Dim oCursor						'Text cursor

	document = ThisComponent.CurrentController.Frame
	dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
	oVCursor = ThisComponent.getCurrentController().getViewCursor()
	oText = oVCursor.getText()
	oCursor = oText.createTextCursorByRange(oVCursor)
	
	If (Len(oCursor.getString) = 0) Then dispatcher.executeDispatch(document, ".uno:SelectAll", "", 0, Array())
	dispatcher.executeDispatch(document, ".uno:SortDialog", "", 0, Array())

End Sub


Sub FindReplaceString(aFind,aReplace,aRegEx)
	REM Generic search/replace routine, used here for changing line breaks in the WordPerfect routine
	REM If nothing selected, processes entire document
	oDoc = thisComponent
	oReplace = oDoc.createReplaceDescriptor
	if aRegEx="True" Then oReplace.SearchRegularExpression=True
	oReplace.setSearchString(aFind)
	oVC = oDoc.CurrentController.getViewCursor
	oFind = oDoc.Text.createTextCursorByRange(oVC.Start,false)
	oEndTC = oDoc.Text.createTextCursorByRange(oVC.End,false)
	oFind = oDoc.FindNext(oFind.End,oReplace)
	if (len(oVC.getString) = 0) Then
		oReplace.ReplaceString = aReplace
		oDoc.ReplaceAll(oReplace)
	Else
		While NOT isNull(oFind)
			oFind.String = aReplace
			oFind = oDoc.FindNext(oFind.End,oReplace)
		Wend
	End If
End Sub
Works for me, but no guarantees. In particular, it can be a bit weird if the text selection includes the very top of the document, though that might just be a bug in OOo. If anyone wants to improve on the code, hack away.
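
The idea behind SortWP (treat everything between two empty lines as one record, then sort the records as units) can be tried out on a plain string without touching a document. This is only a sketch under stated assumptions: lines are separated by Chr(10), records by exactly one empty line, and the comparison is a simple case-insensitive one rather than Writer's own sort. SortBlocks and SortBlocksDemo are hypothetical names and are not part of the attached extension.

```basic
Function SortBlocks(sText As String) As String
	' Records are separated by one empty line, i.e. two consecutive Chr(10)
	Dim sSep As String
	Dim aBlocks As Variant
	Dim sTmp As String
	Dim i As Long
	Dim j As Long

	sSep = Chr(10) & Chr(10)
	aBlocks = Split(sText, sSep)

	' Insertion sort on whole records, ignoring case
	For i = LBound(aBlocks) + 1 To UBound(aBlocks)
		sTmp = aBlocks(i)
		j = i - 1
		Do While j >= LBound(aBlocks)
			If LCase(aBlocks(j)) <= LCase(sTmp) Then Exit Do
			aBlocks(j + 1) = aBlocks(j)
			j = j - 1
		Loop
		aBlocks(j + 1) = sTmp
	Next i

	SortBlocks = Join(aBlocks, sSep)
End Function

Sub SortBlocksDemo
	' Two sample records in the wrong order
	Dim sIn As String
	sIn = "Lampoon - noun" & Chr(10) & ": a composition"
	sIn = sIn & Chr(10) & Chr(10)
	sIn = sIn & "Freeload - verb xx" & Chr(10) & ": To take advantage"
	MsgBox SortBlocks(sIn)
End Sub
```

Because each record is compared as a whole, the definition line always travels with its headword, which is the behaviour the macro above gets by temporarily turning paragraph ends into line breaks.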

The second part of your question is tricky, as OOo can't directly sort non-adjacent paragraphs (and the Regular Expression engine sucks ostrich eggs!). Sorting all (WP-style) paragraphs containing "xx" in the first line (i.e. with the sorted results all together) shouldn't be impossible, if I understand what you're trying to do. Let's say we start with seven paragraphs (empty lines omitted for the sake of clarity):
Zoological curiosities unicorn xx
Zoological curiosities cat
Ardent desires
Teddy bears picnic
Zoological curiosities manticora xx
Morning coffee
What's up with my mind today?
My macro above will sort these to:
Ardent desires
Morning coffee
Teddy bears picnic
What's up with my mind today?
Zoological curiosities cat
Zoological curiosities manticora xx
Zoological curiosities unicorn xx
Now, sorting only the lines containing xx and moving them to the top, we get:
Zoological curiosities manticora xx
Zoological curiosities unicorn xx
Ardent desires
Morning coffee
Teddy bears picnic
What's up with my mind today?
Zoological curiosities cat
If that's what you're aiming for, the logic would be something like this:

1. prompt for the string to search for (i.e. "xx")
2. temporarily convert non-empty line feeds to a code. It's actually easier to use a single character than the "¶¶¶" I use in the macro above. Let's use the currency symbol (¤), which is included in ANSI but is rarely if ever used
3. run a regex search on the document to select all paragraphs using something like: ^[^¤]*xx.*$
(this should find all paragraphs containing "xx" between the start of the paragraph and the first occurrence of "¤": ^ means start of paragraph; [^¤]* any run of characters other than ¤; .* any characters; $ is end of paragraph, but I think it would have to be changed to Chr(13) in the macro to keep the paragraph breaks intact)
4. seeing as the records should already be sorted if you've run my macro first, cut the selected paragraphs and paste them at the top or at the end of the doc.
5. convert the temporary line feed codes back to carriage returns
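
The outcome of the steps above can be sketched on a plain string before wiring anything into Writer. This is not the macro described in the steps (it uses no regex and does not touch the document); it only illustrates the intended result under the same assumptions as before: lines separated by Chr(10), records separated by exactly one empty line, and records already sorted. MarkedFirst and MarkedFirstDemo are hypothetical names.

```basic
Function MarkedFirst(sText As String, sMarker As String) As String
	' Returns the records whose FIRST line contains sMarker, followed by the rest
	Dim sSep As String
	Dim aBlocks As Variant
	Dim sBlock As String
	Dim sFirst As String
	Dim sMarked As String
	Dim sRest As String
	Dim nPos As Long
	Dim i As Long

	sSep = Chr(10) & Chr(10)
	aBlocks = Split(sText, sSep)

	For i = LBound(aBlocks) To UBound(aBlocks)
		sBlock = aBlocks(i)
		' The first line is everything up to the first line feed, if there is one
		nPos = InStr(sBlock, Chr(10))
		If nPos > 0 Then
			sFirst = Left(sBlock, nPos - 1)
		Else
			sFirst = sBlock
		End If

		If InStr(sFirst, sMarker) > 0 Then
			If Len(sMarked) > 0 Then sMarked = sMarked & sSep
			sMarked = sMarked & sBlock
		Else
			If Len(sRest) > 0 Then sRest = sRest & sSep
			sRest = sRest & sBlock
		End If
	Next i

	If Len(sMarked) > 0 And Len(sRest) > 0 Then
		MarkedFirst = sMarked & sSep & sRest
	Else
		MarkedFirst = sMarked & sRest
	End If
End Function

Sub MarkedFirstDemo
	' Three already-sorted sample records; two carry the marker
	Dim sIn As String
	sIn = "Freeload - verb xx" & Chr(10) & ": To take advantage"
	sIn = sIn & Chr(10) & Chr(10) & "Lampoon - noun" & Chr(10) & ": a composition"
	sIn = sIn & Chr(10) & Chr(10) & "Relucent - adjective xx" & Chr(10) & ": shining; bright."
	MsgBox MarkedFirst(sIn, "xx")
End Sub
```

Only the first line of each record is tested, so an "xx" that happens to occur inside a definition does not pull that record to the top. To extract the marked records instead of reordering, return sMarked alone.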

If that's what you have in mind, I think I could code that easily enough (in fact, I could probably adapt my macro to do it in just one sweep). If it's something else, I daresay the solution is way too complicated for me!

Personally, I still use WordPerfect for any fancy kind of sorting (the totally excellent WordPerfect 5.1 for DOS!)

The easier solution in OOo is to use a Calc spreadsheet for sortable records, which should let you sort by specific columns.

Hope some of this helps you,
~bluegecko

Edited: included response to second part of query
Attachments
sortParagraphs.zip
(2.87 KiB) Downloaded 184 times
OOo 3.0.X on Ms Windows XP