[SOLVED] Calc: Macro for removing duplicates and sorting
[SOLVED] Calc: Macro for removing duplicates and sorting
Hello guys,
I need help creating a macro that would sort entries by a specific column and then remove rows that contain duplicate entries in another column. From time to time, I have to process a large number of small datasets and the workflow is very similar, so I was wondering if this could be done with a macro, instead of doing it all by hand. Here is how it is usually done:
1. Create a new calc file.
2. Paste data into the file. The number of columns is always 4, the number of rows may be anything from 10 to 1000+ (an example is attached).
3. Sort everything by column C, descending order (Data->Sort / Sort by Column C, Descending)
4. Filter out duplicate entries in Column A using the Data->Filter function of Calc.
One of the quirks is that duplicate entries in column A may have different values in column C. In this scenario, I would like to keep the row that contains an entry with the largest value in column C and remove rows with other duplicates.
If anyone could help with this, it would save me tonnes of time in the future. Thank you!
I need help creating a macro that would sort entries by a specific column and then remove rows that contain duplicate entries in another column. From time to time, I have to process a large number of small datasets and the workflow is very similar, so I was wondering if this could be done with a macro, instead of doing it all by hand. Here is how it is usually done:
1. Create a new calc file.
2. Paste data into the file. The number of columns is always 4, the number of rows may be anything from 10 to 1000+ (an example is attached).
3. Sort everything by column C, descending order (Data->Sort / Sort by Column C, Descending)
4. Filter out duplicate entries in Column A using the Data->Filter function of Calc.
One of the quirks is that duplicate entries in column A may have different values in column C. In this scenario, I would like to keep the row that contains an entry with the largest value in column C and remove rows with other duplicates.
If anyone could help with this, it would save me tonnes of time in the future. Thank you!
- Attachments
-
- example-list.ods
- (12.19 KiB) Downloaded 265 times
Last edited by robleyd on Fri May 15, 2020 1:03 am, edited 2 times in total.
Reason: Add green tick
Reason: Add green tick
Apache OpenOffice 4.1.1 / Windows 7
Re: Calc: Macro for removing duplicates and sorting
If you make a start then we can help if you get stuck...
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Re: Calc: Macro for removing duplicates and sorting
I would love to, but I am very far from even basic programming and I am afraid that investing a lot of time into learning programming languages will not do much good, since I can't really apply it that frequently. I have tried using the "record micro" option in calc, but that does not seem to work for filters.JeJe wrote:If you make a start then we can help if you get stuck...
Apache OpenOffice 4.1.1 / Windows 7
Re: Calc: Macro for removing duplicates and sorting
Someone might be willing to write it all for you, who knows. I can start it. 1) is
Code: Select all
Doc = Stardesktop.loadComponentFromURL("private:factory/scalc", "_blank", 0,array())
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
Re: Calc: Macro for removing duplicates and sorting
I think it is easy to do with consolidate or a pivottable. Use in both case MAX.
- Attachments
-
- example-listRPG.ods
- (15.67 KiB) Downloaded 255 times
LibreOffice 7.1.4.2 on openSUSE Leap 15.2
- JohnSUN-Pensioner
- Volunteer
- Posts: 876
- Joined: Fri Jan 14, 2011 1:21 pm
- Location: Kyiv, Ukraine
Re: Calc: Macro for removing duplicates and sorting
Paste your list from A1, press button and copy result
- Attachments
-
- sort-filter-list.ods
- (21.62 KiB) Downloaded 264 times
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Re: Calc: Macro for removing duplicates and sorting
That's what I am hoping, thank you for the part.JeJe wrote:Someone might be willing to write it all for you, who knows. I can start it. 1) is
Code: Select all
Doc = Stardesktop.loadComponentFromURL("private:factory/scalc", "_blank", 0,array())
Thank you for trying. I may be explaining this incorrectly, but that's not exactly the outcome I need and creating a table like that would actually require more steps. I do not need to perform any calculations on data, I just need to sort rows in the descending order, using values in column C, and, if there are duplicate entries in column 1, remove the rows with these "extra" duplicates (leaving the entry with the largest value in column C).RPG wrote:I think it is easy to do with consolidate or a pivottable. Use in both case MAX.
Here is the same example file once it has been processed, this is the outcome I am looking for.
- Attachments
-
- example-list-processed.ods
- (10.84 KiB) Downloaded 220 times
Apache OpenOffice 4.1.1 / Windows 7
- JohnSUN-Pensioner
- Volunteer
- Posts: 876
- Joined: Fri Jan 14, 2011 1:21 pm
- Location: Kyiv, Ukraine
Re: Calc: Macro for removing duplicates and sorting
You probably didn’t notice that there is an error in your algorithm? In the sample result, you have two rows of tulipanes en /state/ - duplicates, one of which must be deleted.
The fact is that you need to sort the list by two columns at once - by A, and by C
The fact is that you need to sort the list by two columns at once - by A, and by C
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Re: Calc: Macro for removing duplicates and sorting
Thank you for pointing this out, I have noticed that after I have posted the file.JohnSUN-Pensioner wrote:You probably didn’t notice that there is an error in your algorithm? In the sample result, you have two rows of tulipanes en /state/ - duplicates, one of which must be deleted.
The fact is that you need to sort the list by two columns at once - by A, and by C
The filter function in your file works like a charm, it does remove all duplicates, thank you. It does not seem to sort the results, values in Column C are presented in a seemingly random order, but I guess I can simply record another macro for that.
Apache OpenOffice 4.1.1 / Windows 7
- JohnSUN-Pensioner
- Volunteer
- Posts: 876
- Joined: Fri Jan 14, 2011 1:21 pm
- Location: Kyiv, Ukraine
Re: Calc: Macro for removing duplicates and sorting
No need, it's donekipana wrote: I guess I can simply record another macro for that.
- Attachments
-
- sort-filter-list-desc.ods
- (17.63 KiB) Downloaded 325 times
I may not have a lot to give but what I got I'll give to you...
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Apache OpenOffice 4.1.5, LibreOffice 6.4.4.2 (x64) on Windows 7
If you think that I did not answer your question, make allowances for my imperfect English
Re: Calc: Macro for removing duplicates and sorting
Wow, you're amazing! This is exactly what I need, thank you!JohnSUN-Pensioner wrote:No need, it's donekipana wrote: I guess I can simply record another macro for that.
Do you happen to accept donations for your work?
Apache OpenOffice 4.1.1 / Windows 7
Re: Calc: Macro for removing duplicates and sorting
This is a 100% database task and a database makes this by far easier than anything else.
Add a first row of column headers to your list and save the spreadsheet to a dedicated directory as dBase (*.dbf)
menuFile>New>Database...
[X] Connect to existing database
Type: dBase
Point to the directory
[X] Register database
Save the database document.
menu:Insert>Query(SQL View...)
Replace the quoted column names and the table name "Sheet1" wwith your actual names.
Save the query, save and close the database.
A query is like a spreadsheet formula but instead of a single value it returns a whole table
.----------------------
You can linkk this sorted, unique record set to a spreadsheet:
Hit F4 (menu:View>Data Sources) and browse your database.
Drag the query icon into a spreadsheet. The result is a linked database range which can be refreshed via menu:Data>Refresh
.----------------------
You can drag this into a Writer document as well.
.----------------------
You can edit the table right in the data source window
Add a first row of column headers to your list and save the spreadsheet to a dedicated directory as dBase (*.dbf)
menuFile>New>Database...
[X] Connect to existing database
Type: dBase
Point to the directory
[X] Register database
Save the database document.
menu:Insert>Query(SQL View...)
Code: Select all
SELECT DISTINCT "Column A", "Column B", "Column C", "Column D" FROM "Sheet1"
Save the query, save and close the database.
A query is like a spreadsheet formula but instead of a single value it returns a whole table
.----------------------
You can linkk this sorted, unique record set to a spreadsheet:
Hit F4 (menu:View>Data Sources) and browse your database.
Drag the query icon into a spreadsheet. The result is a linked database range which can be refreshed via menu:Data>Refresh
.----------------------
You can drag this into a Writer document as well.
.----------------------
You can edit the table right in the data source window
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Calc: Macro for removing duplicates and sorting
Thank you for contributing and providing an alternative.Villeroy wrote:This is a 100% database task and a database makes this by far easier than anything else.
Add a first row of column headers to your list and save the spreadsheet to a dedicated directory as dBase (*.dbf)
menuFile>New>Database...
[X] Connect to existing database
Type: dBase
Point to the directory
[X] Register database
Save the database document.
menu:Insert>Query(SQL View...)Replace the quoted column names and the table name "Sheet1" wwith your actual names.Code: Select all
SELECT DISTINCT "Column A", "Column B", "Column C", "Column D" FROM "Sheet1"
Save the query, save and close the database.
A query is like a spreadsheet formula but instead of a single value it returns a whole table
.----------------------
You can linkk this sorted, unique record set to a spreadsheet:
Hit F4 (menu:View>Data Sources) and browse your database.
Drag the query icon into a spreadsheet. The result is a linked database range which can be refreshed via menu:Data>Refresh
.----------------------
You can drag this into a Writer document as well.
.----------------------
You can edit the table right in the data source window
Being the dummy that I am, I am going to have to stick to the solution offered by JohnSUN-Pensioner, it is fully sufficient for my purposes and the amount of work that needs to be done. Nevertheless, I really appreciate everyone trying to help.
Apache OpenOffice 4.1.1 / Windows 7
Re: Calc: Macro for removing duplicates and sorting [SOLVED]
So you prefer the complicated solution which you can not maintain?
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Calc: Macro for removing duplicates and sorting [SOLVED]
I'm not here to argue or prove anything, honest. I do not store these processed files, I just extract the pieces I need to move on, so I do believe that I should be fine with the provided solution. Thanks again for trying to help.Villeroy wrote:So you prefer the complicated solution which you can not maintain?
Apache OpenOffice 4.1.1 / Windows 7