Page 1 of 1

Remove duplicate rows in a texttable

Posted: Sat Nov 20, 2021 2:40 am
by JeJe
This macro removes duplicate rows in a texttable when the cursor is placed in the column of a texttable by which the duplicates are to be filtered. Duplicates are assumed to have the same string.

Code: Select all

Sub removeduplicateRows()
	dim vc,founddeletes as boolean,i,t,col,sts,cc as new collection,n,doc
	doc = thiscomponent
	vc = doc.currentcontroller.viewcursor
	if isempty(vc.texttable) then
		msgbox "First click in a texttable in the column you want to use to filter out duplicate rows."
		exit sub
	end if
	t = vc.texttable
	doc.lockcontrollers
	on error goto hr
	col = asc(vc.cell.cellname)-65
	n= t.rows.count -1
	for i =  n to 0 step -1
		st=t.getCellByPosition(col,i).string
		if isincollection(cc,st) = false then
			cc.add "",st
		else
			if not founddeletes then
				doc.undomanager.enterUndoContext("Delete Duplicates")
				founddeletes = true
			end if
			t.rows.removebyindex(i,1)
		end if
	next
hr:
	doc.unlockcontrollers
	if founddeletes then doc.undomanager.leaveUndoContext

End Sub

function isincollection(cc,st) as boolean
	dim a
	on error goto hr
	a =cc.item(st)
	isincollection=true
	exit function
hr:
end function


This almost identical macro removes duplicate rows in a texttable only where the strings of all the cells in the whole row are identical. Assumes a simple table.

Code: Select all


Sub removeduplicateFullRows()
	dim vc,founddeletes as boolean,i,j,t,ubcolumns,sts,cc as new collection,n,doc
	doc = thiscomponent
	vc = doc.currentcontroller.viewcursor
	if isempty(vc.texttable) then
		msgbox "First click in a texttable to remove duplicate full rows"
		exit sub
	end if
	t = vc.texttable
	doc.lockcontrollers
	on error goto hr
	n= t.rows.count -1
	ubcolumns=t.columns.count -1
	for i =  n to 0 step -1
	st = ""
	for j= 0 to ubcolumns
		st=st & t.getCellByPosition(j,i).string
	next
		if isincollection(cc,st) = false then
			cc.add "",st
		else
			if not founddeletes then
				doc.undomanager.enterUndoContext("Delete Duplicates")
				founddeletes = true
			end if
			t.rows.removebyindex(i,1)
		end if
	next
hr:
	doc.unlockcontrollers
	if founddeletes then doc.undomanager.leaveUndoContext

End Sub
Edit: not tested to destruction, always have a backup.

Re: Remove duplicate rows in a texttable

Posted: Sat Nov 20, 2021 10:45 am
by karolus
Hallo

Transition into python (the FullRow-variant) :

Code: Select all

def remove_duplicates_from_Texttable():
    doc = XSCRIPTCONTEXT.getDocument()
    sel = doc.CurrentSelection[0]
    table = sel.TextTable
    data = table.DataArray
    seen = set()
    to_delete = []
    for n, row in enumerate(data):
        if row in seen:
            to_delete.append(n)
        else:
            seen.add(row)
    rows = table.Rows
    undo = doc.getUndoManager()
    undo.enterUndoContext("remove_Rows")
    for row_index in reversed(to_delete):
        rows.removeByIndex(row_index,1)
    undo.leaveUndoContext()

Re: Remove duplicate rows in a texttable

Posted: Mon Nov 22, 2021 2:08 pm
by JeJe
Thanks karolus.

*

I've developed this a little further into a small extension
Install and then open the attached help/test file.

There a few things to consider such as:

- you might want duplicate filter cells not to be quite an exact match.
I've added match/not match case and Trim outer spaces option

- what to do with the other cells, discard the contents or append them to the one remaining result
I've added append string option and the choice of setting the separator string between them.

Expect bugs... always back up your work...