Remove duplicate rows in a texttable

Shared Libraries
Forum rules
For sharing working examples of macros / scripts. These can be in any script language supported by OpenOffice.org [Basic, Python, Netbean] or as source code files in Java or C# even - but requires the actual source code listing. This section is not for asking questions about writing your own macros.
Post Reply
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Remove duplicate rows in a texttable

Post by JeJe »

This macro removes duplicate rows in a texttable when the cursor is placed in the column of a texttable by which the duplicates are to be filtered. Duplicates are assumed to have the same string.

Code: Select all

Sub removeduplicateRows()
	dim vc,founddeletes as boolean,i,t,col,sts,cc as new collection,n,doc
	doc = thiscomponent
	vc = doc.currentcontroller.viewcursor
	if isempty(vc.texttable) then
		msgbox "First click in a texttable in the column you want to use to filter out duplicate rows."
		exit sub
	end if
	t = vc.texttable
	doc.lockcontrollers
	on error goto hr
	col = asc(vc.cell.cellname)-65
	n= t.rows.count -1
	for i =  n to 0 step -1
		st=t.getCellByPosition(col,i).string
		if isincollection(cc,st) = false then
			cc.add "",st
		else
			if not founddeletes then
				doc.undomanager.enterUndoContext("Delete Duplicates")
				founddeletes = true
			end if
			t.rows.removebyindex(i,1)
		end if
	next
hr:
	doc.unlockcontrollers
	if founddeletes then doc.undomanager.leaveUndoContext

End Sub

function isincollection(cc,st) as boolean
	dim a
	on error goto hr
	a =cc.item(st)
	isincollection=true
	exit function
hr:
end function


This almost identical macro removes duplicate rows in a texttable only where the strings of all the cells in the whole row are identical. Assumes a simple table.

Code: Select all


Sub removeduplicateFullRows()
	dim vc,founddeletes as boolean,i,j,t,ubcolumns,sts,cc as new collection,n,doc
	doc = thiscomponent
	vc = doc.currentcontroller.viewcursor
	if isempty(vc.texttable) then
		msgbox "First click in a texttable to remove duplicate full rows"
		exit sub
	end if
	t = vc.texttable
	doc.lockcontrollers
	on error goto hr
	n= t.rows.count -1
	ubcolumns=t.columns.count -1
	for i =  n to 0 step -1
	st = ""
	for j= 0 to ubcolumns
		st=st & t.getCellByPosition(j,i).string
	next
		if isincollection(cc,st) = false then
			cc.add "",st
		else
			if not founddeletes then
				doc.undomanager.enterUndoContext("Delete Duplicates")
				founddeletes = true
			end if
			t.rows.removebyindex(i,1)
		end if
	next
hr:
	doc.unlockcontrollers
	if founddeletes then doc.undomanager.leaveUndoContext

End Sub
Edit: not tested to destruction, always have a backup.
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
User avatar
karolus
Volunteer
Posts: 1160
Joined: Sat Jul 02, 2011 9:47 am

Re: Remove duplicate rows in a texttable

Post by karolus »

Hallo

Transition into python (the FullRow-variant) :

Code: Select all

def remove_duplicates_from_Texttable():
    doc = XSCRIPTCONTEXT.getDocument()
    sel = doc.CurrentSelection[0]
    table = sel.TextTable
    data = table.DataArray
    seen = set()
    to_delete = []
    for n, row in enumerate(data):
        if row in seen:
            to_delete.append(n)
        else:
            seen.add(row)
    rows = table.Rows
    undo = doc.getUndoManager()
    undo.enterUndoContext("remove_Rows")
    for row_index in reversed(to_delete):
        rows.removeByIndex(row_index,1)
    undo.leaveUndoContext()
AOO4, Libreoffice 6.1 on Rasbian OS (on ARM)
Libreoffice 7.4 on Debian 12 (Bookworm) (on RaspberryPI4)
Libreoffice 7.6 flatpak on Debian 12 (Bookworm) (on RaspberryPI4)
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Re: Remove duplicate rows in a texttable

Post by JeJe »

Thanks karolus.

*

I've developed this a little further into a small extension
Install and then open the attached help/test file.

There a few things to consider such as:

- you might want duplicate filter cells not to be quite an exact match.
I've added match/not match case and Trim outer spaces option

- what to do with the other cells, discard the contents or append them to the one remaining result
I've added append string option and the choice of setting the separator string between them.

Expect bugs... always back up your work...
Attachments
JeTextTable.oxt
(7.71 KiB) Downloaded 385 times
remove duplicate test document.odt
(39.2 KiB) Downloaded 377 times
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Post Reply