[Solved] Find duplicate values in a column row

Discuss the spreadsheet application
Post Reply
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

[Solved] Find duplicate values in a column row

Post by bhcmtr »

Hi there,
I tried Googling this (for hours) to find a pre-existing answer - however I'm afraid I still can't work this out by myself (I'm a bit of newbie to OpenOffice Calc).

I have a sheet of product data where all product attributes are stored in a single column.

I tried to detect if the row exist some duplicate text ot name. My idea is to copy some text from the website and to check it for dups.
the text is copied with the time front, which prevents correct reporting for the duplicate. it would also be able to read the names before and after the dash mark.

The column data looks like the list below:
03:21Oliveira, Charles - Lentz, Nik
03:20Oliveira, Charles - Lentz, Nik
03:10Oliveira, Charles - Lentz, Nik

the formula that I use is: =IF(COUNTIF($A$1:$A$9999;A17)>1;"Duplicate";"")
I believe it should include some exceptions witch I can't find and add to this formula.

Any tips on how to do this would be greatly appreciated!!

Thank you in advance.
Last edited by robleyd on Tue May 28, 2019 7:56 am, edited 1 time in total.
Reason: Tagged [Solved]
apache openoffice 4.1.4
User avatar
RusselB
Moderator
Posts: 6646
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Post by RusselB »

Welcome to the Forums.
I'm going to suggest you start with

Code: Select all

=mid(A1;6;len(a1))
in B1
Then copy that down for each row that has data.
Then use your formula in C, changing the range column from A to B

As to adding exceptions, I can't advise on that without an example of the exceptional data.
OpenOffice 4.1.7, LibreOffice 7.0.1.2 on Windows 7 Pro, Ultimate & Windows 10 Home (2004)
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Post by bhcmtr »

Thank you RusselB for the reply.

It was really helpful and I going to integrate your solution into my table. Now I have to find a way how to read the text separately for dips before and after the the dash mark.

Thank you in advance for all helps.
apache openoffice 4.1.4
User avatar
robleyd
Moderator
Posts: 5082
Joined: Mon Aug 19, 2013 3:47 am
Location: Murbko, Australia

Re: Find duplicate values in a column row

Post by robleyd »

I don't understand what you mean by "dips before and after the the dash mark"; but you can use SEARCH() to find the position of of a sub-string in a string. Then you can manipulate the string as you need with other text functions.
Cheers
David
OS - Slackware 15 64 bit
Apache OpenOffice 4.1.15
LibreOffice 24.2.2.2; SlackBuild for 24.2.2 by Eric Hameleers
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Post by bhcmtr »

My idea is to find out dups information. My decision for now is to indicate and remove time frame and to separate team names.

example:
18:00FK Inter Bratislava - FK SamorinLive

Decision:
=MID(A5;6;LEN(A5)) ---> FK Inter Bratislava - FK SamorinLive

into next step I separate team names into 2 columns:
first column: =MID(F5;1;SEARCH(" - ";F5;1)) --->FK Inter Bratislava
second column: =MID(F5;SEARCH(" - ";F5);100) ---> - FK SamorinLive
and with this formula I check each column for duplicates: =IF(COUNTIF($G4:$G$10008;G4)>1;"Duplicate";"")

as you can see from the second column there is with " - " symbol and if its possible in the same row I would like to remove / hide "live" text.

Also I try to check out the result of 1`st and 2`nd row with this formula but not indicate feedback if the text is not entered manually:
IF(ISERROR(INDEX(G$4:G$99;MATCH(J4;G$4:G$99;0)));"";INDEX(G$4:G$99;MATCH(J4;G$4:G$99;0)))


Thanks in advance for any help you are able to provide.
apache openoffice 4.1.4
User avatar
keme
Volunteer
Posts: 3704
Joined: Wed Nov 28, 2007 10:27 am
Location: Egersund, Norway

Re: Find duplicate values in a column row

Post by keme »

The separate data elements need to go in separate columns for this to be manageable. Splitting an input line into separate elements is one phase of the process known as "parsing".

There is no builtin function in Calc (or other spreadsheets I know of) for parsing text input. You have to go one level lower, locating the required separation between input elements and using that location as a "split point". See attached file.

Note the use of TRIM() to remove superfluous spaces.

Note also the use of grouping. A group consisting of a single column may seem silly at first. Click on the tiny buttons in the top left corner (labeled 1 and 2) to see why it may be useful ;-)
Attachments
LineParser.ods
(10.25 KiB) Downloaded 164 times
User avatar
MrProgrammer
Moderator
Posts: 4905
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: Find duplicate values in a column row

Post by MrProgrammer »

keme wrote:There is no builtin function in Calc (or other spreadsheets I know of) for parsing text input
keme is correct that there is no easy way to parse using formulas, however the Data → Text to Columns feature can parse by delimiters or by position. If the time at the beginning to to be discarded, I would begin by using the Fixed With mode and setting the Column Type of the 5-character time field to Hide. Then I would use Text to Columns again in Separated By mode to split at the dash. This will leave the spaces before and after the dash in the fields. They can be removed by Edit → Find&Replace if desired though consistent use of spaces will not affect locating duplicate values.

[Tutorial] Text to Columns

If this solved your problem please go to your first post use the Edit button and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).
User avatar
RusselB
Moderator
Posts: 6646
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Post by RusselB »

While Calc doesn't have a built-in text parsing function, Quattro Pro does... or at least it used to... I don't have a current version of Quattro Pro to verify that the function still exists.
Nonetheless, based on the information given on your May 18 post, I came up with what is in my attachment.
There are two options, I personally prefer the option in the first line, even though it requires a helper column.
The option in the 2nd line is almost the same, but with the helper column requirement being eliminated by amending the formulas in C & D so that the formula in B is incorporated directly.

PS: I didn't include the formula for showing if there are duplicates or not.
Attachments
topic 98057.ods
(8.59 KiB) Downloaded 171 times
OpenOffice 4.1.7, LibreOffice 7.0.1.2 on Windows 7 Pro, Ultimate & Windows 10 Home (2004)
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Post by bhcmtr »

Thank you for all your help / decisions.

Lastly, I would be a lot thankful to decide how to compare 2 columns for dups., and the formula should read all the rows.
Unfortunately, I cannot find a solution because the result comes with a formula.

example:
Inter - PSG
Spain - Angola
Liverpool - Dinamo
Rapid - Inter
Angola - Swaziland

Here I need to detect that Inter and Angola are duplicated. Thank you in advance. I really appreciate all your help.
apache openoffice 4.1.4
User avatar
RusselB
Moderator
Posts: 6646
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Post by RusselB »

In your example, are the words in two different columns or one column with the words separated by a hyphen?
If they are already in two columns, then use a 3rd & 4th helper columns with the 3rd column having the formula

Code: Select all

=if(countif($A1:$B$5;A1)>1;A1;"")
Adjust the $A1 to match the column and row for your first piece of data, and $B$5 to match the column and row of your last piece.
After you have made those adjustments and entered the formula into the new correct location, you can copy & paste it.
If you don't want a formula, then you might be able to do this using a Standard Filter.
I can look into this more after I've gotten some sleep.

Another option might be to use a Conditional Format formula... Again, I need some sleep before looking into that some more,
OpenOffice 4.1.7, LibreOffice 7.0.1.2 on Windows 7 Pro, Ultimate & Windows 10 Home (2004)
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Post by bhcmtr »

Hi, I attached you my file.
Wordbook.ods
(24 KiB) Downloaded 143 times
As you can see I follow all the instruction / decisions till now, so my final question is: Also as you can see at my project file when the players names are duplicate on column "E" and on column "H" . it is possible in column "K" to be shown as duplicate?
apache openoffice 4.1.4
User avatar
RusselB
Moderator
Posts: 6646
Joined: Fri Jan 03, 2014 7:31 am
Location: Sarnia, ON

Re: Find duplicate values in a column row

Post by RusselB »

A slight modification to your layout and your formula in E gives me a (for now) working solution with the duplicate formula in K.
I just made modifications for the date 26 May so that you can easily compare what I'm suggesting with what you had...as far as the formulas are concerned.
The change in the layout had to affect the entire spreadsheet.
Attachments
Wordbook.ods
(23.75 KiB) Downloaded 184 times
OpenOffice 4.1.7, LibreOffice 7.0.1.2 on Windows 7 Pro, Ultimate & Windows 10 Home (2004)
If you believe your problem has been resolved, please go to your first post in this topic, click the Edit button and add [Solved] to the beginning of the Subject line.
bhcmtr
Posts: 9
Joined: Thu May 16, 2019 2:25 pm

Re: Find duplicate values in a column row

Post by bhcmtr »

RusselB wrote:A slight modification to your layout and your formula in E gives me a (for now) working solution with the duplicate formula in K.
I just made modifications for the date 26 May so that you can easily compare what I'm suggesting with what you had...as far as the formulas are concerned.
The change in the layout had to affect the entire spreadsheet.
Thank you so much. You were really helpful.

My drama with duplicate things is completed.
I will think about developing the table and, if I needed further help, I will open a new topic.


The post can be locked
apache openoffice 4.1.4
Post Reply