[SOLVED] Calc: Macro for removing duplicates and sorting

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
Post Reply
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

[SOLVED] Calc: Macro for removing duplicates and sorting

Post by kipana »

Hello guys,

I need help creating a macro that would sort entries by a specific column and then remove rows that contain duplicate entries in another column. From time to time, I have to process a large number of small datasets and the workflow is very similar, so I was wondering if this could be done with a macro, instead of doing it all by hand. Here is how it is usually done:

1. Create a new calc file.
2. Paste data into the file. The number of columns is always 4, the number of rows may be anything from 10 to 1000+ (an example is attached).
3. Sort everything by column C, descending order (Data->Sort / Sort by Column C, Descending)
4. Filter out duplicate entries in Column A using the Data->Filter function of Calc.

One of the quirks is that duplicate entries in column A may have different values in column C. In this scenario, I would like to keep the row that contains an entry with the largest value in column C and remove rows with other duplicates.

If anyone could help with this, it would save me tonnes of time in the future. Thank you!
Attachments
example-list.ods
(12.19 KiB) Downloaded 265 times
Last edited by robleyd on Fri May 15, 2020 1:03 am, edited 2 times in total.
Reason: Add green tick
Apache OpenOffice 4.1.1 / Windows 7
JeJe
Volunteer
Posts: 2780
Joined: Wed Mar 09, 2016 2:40 pm

Re: Calc: Macro for removing duplicates and sorting

Post by JeJe »

If you make a start then we can help if you get stuck...
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting

Post by kipana »

JeJe wrote:If you make a start then we can help if you get stuck...
I would love to, but I am very far from even basic programming and I am afraid that investing a lot of time into learning programming languages will not do much good, since I can't really apply it that frequently. I have tried using the "record micro" option in calc, but that does not seem to work for filters.
Apache OpenOffice 4.1.1 / Windows 7
JeJe
Volunteer
Posts: 2780
Joined: Wed Mar 09, 2016 2:40 pm

Re: Calc: Macro for removing duplicates and sorting

Post by JeJe »

Someone might be willing to write it all for you, who knows. I can start it. 1) is

Code: Select all


Doc = Stardesktop.loadComponentFromURL("private:factory/scalc", "_blank", 0,array())
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
RPG
Volunteer
Posts: 2250
Joined: Tue Apr 14, 2009 7:15 pm
Location: Netherlands

Re: Calc: Macro for removing duplicates and sorting

Post by RPG »

I think it is easy to do with consolidate or a pivottable. Use in both case MAX.
Attachments
example-listRPG.ods
(15.67 KiB) Downloaded 255 times
LibreOffice 7.1.4.2 on openSUSE Leap 15.2
User avatar
JohnSUN-Pensioner
Volunteer
Posts: 876
Joined: Fri Jan 14, 2011 1:21 pm
Location: Kyiv, Ukraine

Re: Calc: Macro for removing duplicates and sorting

Post by JohnSUN-Pensioner »

Paste your list from A1, press button and copy result
Attachments
sort-filter-list.ods
(21.62 KiB) Downloaded 264 times
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting

Post by kipana »

JeJe wrote:Someone might be willing to write it all for you, who knows. I can start it. 1) is

Code: Select all


Doc = Stardesktop.loadComponentFromURL("private:factory/scalc", "_blank", 0,array())
That's what I am hoping, thank you for the part.
RPG wrote:I think it is easy to do with consolidate or a pivottable. Use in both case MAX.
Thank you for trying. I may be explaining this incorrectly, but that's not exactly the outcome I need and creating a table like that would actually require more steps. I do not need to perform any calculations on data, I just need to sort rows in the descending order, using values in column C, and, if there are duplicate entries in column 1, remove the rows with these "extra" duplicates (leaving the entry with the largest value in column C).

Here is the same example file once it has been processed, this is the outcome I am looking for.
Attachments
example-list-processed.ods
(10.84 KiB) Downloaded 220 times
Apache OpenOffice 4.1.1 / Windows 7
User avatar
JohnSUN-Pensioner
Volunteer
Posts: 876
Joined: Fri Jan 14, 2011 1:21 pm
Location: Kyiv, Ukraine

Re: Calc: Macro for removing duplicates and sorting

Post by JohnSUN-Pensioner »

You probably didn’t notice that there is an error in your algorithm? In the sample result, you have two rows of tulipanes en /state/ - duplicates, one of which must be deleted.
The fact is that you need to sort the list by two columns at once - by A, and by C
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting

Post by kipana »

JohnSUN-Pensioner wrote:You probably didn’t notice that there is an error in your algorithm? In the sample result, you have two rows of tulipanes en /state/ - duplicates, one of which must be deleted.
The fact is that you need to sort the list by two columns at once - by A, and by C
Thank you for pointing this out, I have noticed that after I have posted the file.

The filter function in your file works like a charm, it does remove all duplicates, thank you. It does not seem to sort the results, values in Column C are presented in a seemingly random order, but I guess I can simply record another macro for that.
Apache OpenOffice 4.1.1 / Windows 7
User avatar
JohnSUN-Pensioner
Volunteer
Posts: 876
Joined: Fri Jan 14, 2011 1:21 pm
Location: Kyiv, Ukraine

Re: Calc: Macro for removing duplicates and sorting

Post by JohnSUN-Pensioner »

kipana wrote: I guess I can simply record another macro for that.
No need, it's done
Attachments
sort-filter-list-desc.ods
(17.63 KiB) Downloaded 325 times
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting

Post by kipana »

JohnSUN-Pensioner wrote:
kipana wrote: I guess I can simply record another macro for that.
No need, it's done
Wow, you're amazing! This is exactly what I need, thank you!

Do you happen to accept donations for your work?
Apache OpenOffice 4.1.1 / Windows 7
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Calc: Macro for removing duplicates and sorting

Post by Villeroy »

This is a 100% database task and a database makes this by far easier than anything else.

Add a first row of column headers to your list and save the spreadsheet to a dedicated directory as dBase (*.dbf)
menuFile>New>Database...
[X] Connect to existing database
Type: dBase
Point to the directory
[X] Register database
Save the database document.
menu:Insert>Query(SQL View...)

Code: Select all

SELECT DISTINCT "Column A", "Column B", "Column C", "Column D" FROM "Sheet1"
Replace the quoted column names and the table name "Sheet1" wwith your actual names.
Save the query, save and close the database.
A query is like a spreadsheet formula but instead of a single value it returns a whole table
.----------------------
You can linkk this sorted, unique record set to a spreadsheet:
Hit F4 (menu:View>Data Sources) and browse your database.
Drag the query icon into a spreadsheet. The result is a linked database range which can be refreshed via menu:Data>Refresh
.----------------------
You can drag this into a Writer document as well.
.----------------------
You can edit the table right in the data source window
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting

Post by kipana »

Villeroy wrote:This is a 100% database task and a database makes this by far easier than anything else.

Add a first row of column headers to your list and save the spreadsheet to a dedicated directory as dBase (*.dbf)
menuFile>New>Database...
[X] Connect to existing database
Type: dBase
Point to the directory
[X] Register database
Save the database document.
menu:Insert>Query(SQL View...)

Code: Select all

SELECT DISTINCT "Column A", "Column B", "Column C", "Column D" FROM "Sheet1"
Replace the quoted column names and the table name "Sheet1" wwith your actual names.
Save the query, save and close the database.
A query is like a spreadsheet formula but instead of a single value it returns a whole table
.----------------------
You can linkk this sorted, unique record set to a spreadsheet:
Hit F4 (menu:View>Data Sources) and browse your database.
Drag the query icon into a spreadsheet. The result is a linked database range which can be refreshed via menu:Data>Refresh
.----------------------
You can drag this into a Writer document as well.
.----------------------
You can edit the table right in the data source window
Thank you for contributing and providing an alternative.

Being the dummy that I am, I am going to have to stick to the solution offered by JohnSUN-Pensioner, it is fully sufficient for my purposes and the amount of work that needs to be done. Nevertheless, I really appreciate everyone trying to help.
Apache OpenOffice 4.1.1 / Windows 7
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Calc: Macro for removing duplicates and sorting [SOLVED]

Post by Villeroy »

So you prefer the complicated solution which you can not maintain?
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
kipana
Posts: 14
Joined: Sun Dec 15, 2019 12:43 pm

Re: Calc: Macro for removing duplicates and sorting [SOLVED]

Post by kipana »

Villeroy wrote:So you prefer the complicated solution which you can not maintain?
I'm not here to argue or prove anything, honest. I do not store these processed files, I just extract the pieces I need to move on, so I do believe that I should be fine with the provided solution. Thanks again for trying to help.
Apache OpenOffice 4.1.1 / Windows 7
Post Reply