[Solved] Find duplicate values in a column row

Discuss the spreadsheet application

[Solved] Find duplicate values in a column row

Postby bhcmtr » Thu May 16, 2019 2:41 pm

Hi there,
I tried Googling this (for hours) to find a pre-existing answer - however I'm afraid I still can't work this out by myself (I'm a bit of newbie to OpenOffice Calc).

I have a sheet of product data where all product attributes are stored in a single column.

I tried to detect if the row exist some duplicate text ot name. My idea is to copy some text from the website and to check it for dups.
the text is copied with the time front, which prevents correct reporting for the duplicate. it would also be able to read the names before and after the dash mark.

The column data looks like the list below:
03:21Oliveira, Charles - Lentz, Nik
03:20Oliveira, Charles - Lentz, Nik
03:10Oliveira, Charles - Lentz, Nik

the formula that I use is: =IF(COUNTIF($A$1:$A$9999;A17)>1;"Duplicate";"")
I believe it should include some exceptions witch I can't find and add to this formula.

Any tips on how to do this would be greatly appreciated!!

Thank you in advance.
Last edited by robleyd on Tue May 28, 2019 7:56 am, edited 1 time in total.
Reason: Tagged [Solved]
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Postby RusselB » Thu May 16, 2019 3:43 pm

Welcome to the Forums.
I'm going to suggest you start with
Code: Select all   Expand viewCollapse view
=mid(A1;6;len(a1))
in B1
Then copy that down for each row that has data.
Then use your formula in C, changing the range column from A to B

As to adding exceptions, I can't advise on that without an example of the exceptional data.
OpenOffice 4.1.6 and LibreOffice 6.0.6.2 on Windows 7 Pro & Ultimate
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
User avatar
RusselB
Moderator
 
Posts: 5398
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Postby bhcmtr » Fri May 17, 2019 11:37 am

Thank you RusselB for the reply.

It was really helpful and I going to integrate your solution into my table. Now I have to find a way how to read the text separately for dips before and after the the dash mark.

Thank you in advance for all helps.
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Postby robleyd » Fri May 17, 2019 11:48 am

I don't understand what you mean by "dips before and after the the dash mark"; but you can use SEARCH() to find the position of of a sub-string in a string. Then you can manipulate the string as you need with other text functions.
Cheers
David
Apache OpenOffice Developer Build 4.2.0 9820 - Slackware 14.2 - 64 bit
LibreOffice 6.0.7.3 - Slackware 14.2 - 64 bit
Apache OpenOffice 4.1.4 - Windows 7 Virtual machine
User avatar
robleyd
Moderator
 
Posts: 2936
Joined: Mon Aug 19, 2013 3:47 am
Location: Murbko, Australia

Re: Find duplicate values in a column row

Postby bhcmtr » Sat May 18, 2019 11:58 am

My idea is to find out dups information. My decision for now is to indicate and remove time frame and to separate team names.

example:
18:00FK Inter Bratislava - FK SamorinLive

Decision:
=MID(A5;6;LEN(A5)) ---> FK Inter Bratislava - FK SamorinLive

into next step I separate team names into 2 columns:
first column: =MID(F5;1;SEARCH(" - ";F5;1)) --->FK Inter Bratislava
second column: =MID(F5;SEARCH(" - ";F5);100) ---> - FK SamorinLive
and with this formula I check each column for duplicates: =IF(COUNTIF($G4:$G$10008;G4)>1;"Duplicate";"")

as you can see from the second column there is with " - " symbol and if its possible in the same row I would like to remove / hide "live" text.

Also I try to check out the result of 1`st and 2`nd row with this formula but not indicate feedback if the text is not entered manually:
IF(ISERROR(INDEX(G$4:G$99;MATCH(J4;G$4:G$99;0)));"";INDEX(G$4:G$99;MATCH(J4;G$4:G$99;0)))


Thanks in advance for any help you are able to provide.
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Postby keme » Wed May 22, 2019 2:11 pm

The separate data elements need to go in separate columns for this to be manageable. Splitting an input line into separate elements is one phase of the process known as "parsing".

There is no builtin function in Calc (or other spreadsheets I know of) for parsing text input. You have to go one level lower, locating the required separation between input elements and using that location as a "split point". See attached file.

Note the use of TRIM() to remove superfluous spaces.

Note also the use of grouping. A group consisting of a single column may seem silly at first. Click on the tiny buttons in the top left corner (labeled 1 and 2) to see why it may be useful ;-)
Attachments
LineParser.ods
(10.25 KiB) Downloaded 24 times
User avatar
keme
Volunteer
 
Posts: 3234
Joined: Wed Nov 28, 2007 10:27 am
Location: Egersund, Norway

Re: Find duplicate values in a column row

Postby MrProgrammer » Wed May 22, 2019 6:58 pm

keme wrote:There is no builtin function in Calc (or other spreadsheets I know of) for parsing text input
keme is correct that there is no easy way to parse using formulas, however the Data → Text to Columns feature can parse by delimiters or by position. If the time at the beginning to to be discarded, I would begin by using the Fixed With mode and setting the Column Type of the 5-character time field to Hide. Then I would use Text to Columns again in Separated By mode to split at the dash. This will leave the spaces before and after the dash in the fields. They can be removed by Edit → Find&Replace if desired though consistent use of spaces will not affect locating duplicate values.

[Tutorial] Text to Columns

If this solved your problem please go to your first post use the Edit button and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.5 Build 9789 on MacOS 10.11.6.   The locale for any menus or Calc formulas in my posts is English (USA).
User avatar
MrProgrammer
Moderator
 
Posts: 3811
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: Find duplicate values in a column row

Postby RusselB » Thu May 23, 2019 12:48 am

While Calc doesn't have a built-in text parsing function, Quattro Pro does... or at least it used to... I don't have a current version of Quattro Pro to verify that the function still exists.
Nonetheless, based on the information given on your May 18 post, I came up with what is in my attachment.
There are two options, I personally prefer the option in the first line, even though it requires a helper column.
The option in the 2nd line is almost the same, but with the helper column requirement being eliminated by amending the formulas in C & D so that the formula in B is incorporated directly.

PS: I didn't include the formula for showing if there are duplicates or not.
Attachments
topic 98057.ods
(8.59 KiB) Downloaded 17 times
OpenOffice 4.1.6 and LibreOffice 6.0.6.2 on Windows 7 Pro & Ultimate
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
User avatar
RusselB
Moderator
 
Posts: 5398
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Postby bhcmtr » Sat May 25, 2019 5:50 pm

Thank you for all your help / decisions.

Lastly, I would be a lot thankful to decide how to compare 2 columns for dups., and the formula should read all the rows.
Unfortunately, I cannot find a solution because the result comes with a formula.

example:
Inter - PSG
Spain - Angola
Liverpool - Dinamo
Rapid - Inter
Angola - Swaziland

Here I need to detect that Inter and Angola are duplicated. Thank you in advance. I really appreciate all your help.
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Postby RusselB » Sat May 25, 2019 6:56 pm

In your example, are the words in two different columns or one column with the words separated by a hyphen?
If they are already in two columns, then use a 3rd & 4th helper columns with the 3rd column having the formula
Code: Select all   Expand viewCollapse view
=if(countif($A1:$B$5;A1)>1;A1;"")

Adjust the $A1 to match the column and row for your first piece of data, and $B$5 to match the column and row of your last piece.
After you have made those adjustments and entered the formula into the new correct location, you can copy & paste it.
If you don't want a formula, then you might be able to do this using a Standard Filter.
I can look into this more after I've gotten some sleep.

Another option might be to use a Conditional Format formula... Again, I need some sleep before looking into that some more,
OpenOffice 4.1.6 and LibreOffice 6.0.6.2 on Windows 7 Pro & Ultimate
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
User avatar
RusselB
Moderator
 
Posts: 5398
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Postby bhcmtr » Sat May 25, 2019 8:30 pm

Hi, I attached you my file.

Wordbook.ods
(24 KiB) Downloaded 21 times

As you can see I follow all the instruction / decisions till now, so my final question is: Also as you can see at my project file when the players names are duplicate on column "E" and on column "H" . it is possible in column "K" to be shown as duplicate?
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Postby RusselB » Sun May 26, 2019 2:40 am

A slight modification to your layout and your formula in E gives me a (for now) working solution with the duplicate formula in K.
I just made modifications for the date 26 May so that you can easily compare what I'm suggesting with what you had...as far as the formulas are concerned.
The change in the layout had to affect the entire spreadsheet.
Attachments
Wordbook.ods
(23.75 KiB) Downloaded 17 times
OpenOffice 4.1.6 and LibreOffice 6.0.6.2 on Windows 7 Pro & Ultimate
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
User avatar
RusselB
Moderator
 
Posts: 5398
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Postby bhcmtr » Tue May 28, 2019 7:34 am

RusselB wrote:A slight modification to your layout and your formula in E gives me a (for now) working solution with the duplicate formula in K.
I just made modifications for the date 26 May so that you can easily compare what I'm suggesting with what you had...as far as the formulas are concerned.
The change in the layout had to affect the entire spreadsheet.


Thank you so much. You were really helpful.

My drama with duplicate things is completed.
I will think about developing the table and, if I needed further help, I will open a new topic.


The post can be locked
apache openoffice 4.1.4
bhcmtr
 
Posts: 9
Joined: Thu May 16, 2019 2:25 pm


Return to Calc

Who is online

Users browsing this forum: No registered users and 23 guests