[Solved] Comparing Calc documents with different formats
-
- Posts: 4
- Joined: Tue Mar 24, 2020 2:34 pm
[Solved] Comparing Calc documents with different formats
EDIT: Thanks everyone! I've managed to solve my issues. Could never have done it without your help, so thank you very very much. Stay safe!
Hello everyone. Thank you for reading my problem, and perhaps helping me find a solution
EDIT.
In attachment there are two sample files.
What I want to do is Identify the rows in Sample Two that have a matching row in Sample One.
To do this, I need to match column C in Sample Two (It contains dossier numbers) with the numbers in Sample one, namely those found in column K.
What is the problem? The dossier number in Sample Two are written like this 'B13 18849' The dossier number in Sample One are written in another format. Namely, '106587/B131884900-10001'.
Can I make a new column in Sample Two that compares both documents, and that looks for a (partial) match? (Note that the space between 'B131' and '8849' is only present in Sample two)
Thank you.
Hello everyone. Thank you for reading my problem, and perhaps helping me find a solution
EDIT.
In attachment there are two sample files.
What I want to do is Identify the rows in Sample Two that have a matching row in Sample One.
To do this, I need to match column C in Sample Two (It contains dossier numbers) with the numbers in Sample one, namely those found in column K.
What is the problem? The dossier number in Sample Two are written like this 'B13 18849' The dossier number in Sample One are written in another format. Namely, '106587/B131884900-10001'.
Can I make a new column in Sample Two that compares both documents, and that looks for a (partial) match? (Note that the space between 'B131' and '8849' is only present in Sample two)
Thank you.
- Attachments
-
- Sample Two.ods
- (10.97 KiB) Downloaded 139 times
-
- Sample One.ods
- (10.08 KiB) Downloaded 134 times
Last edited by robleyd on Wed Mar 25, 2020 11:35 am, edited 3 times in total.
Reason: Tag as Solved
Reason: Tag as Solved
OpenOffice 4.1.3 on Windows10
Re: Comparing two documents in Calc with different fomats
Please upload two small .ods files with the values and explain exactly what you are trying to do.
Use the Upload attachment tab below where you type (128 kB max); or use a file share site, Dropbox or Google Drive for a larger file.
I don't know what you mean by " nr ".
Use the Upload attachment tab below where you type (128 kB max); or use a file share site, Dropbox or Google Drive for a larger file.
I don't know what you mean by " nr ".
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
-
- Posts: 4
- Joined: Tue Mar 24, 2020 2:34 pm
Re: Comparing two documents in Calc with different fomats
Thank you , John. I edited the orignial question. I hope it is clearer now.
John_Ha wrote:Please upload two small .ods files with the values and explain exactly what you are trying to do.
Use the Upload attachment tab below where you type (128 kB max); or use a file share site, Dropbox or Google Drive for a larger file.
I don't know what you mean by " nr ".
OpenOffice 4.1.3 on Windows10
Re: Comparing two documents in Calc with different fomats
I can start you off.
1. It is easy (create a new column if required) to delete the space in 'B13 18849' using a Regular Expression search. Search for the first 3 characters, the space, and the last 5 characters, and write back without the space.
2. If the format of '106587/B131884900-10001' is always the same it is easy to extract the 'B1318849' as the eight characters following the / using a Regular expression. Search for any number of digits, for the /, for the next 8 characters, and for anything remaining. Write back the 8 characters.
You may need a more specific search to avoid finding other things.
You now have helper columns in each spreadsheet with the identically formatted eight characters "B1318849" in them.
See [Tutorial] How to record a macro (and Regular Expressions) for help with regular expressions.
Someone else should be able to help with the rest.
As a one time operation, copy everything into one spread sheet having added a column to identify which sheet the line is from.
Sort everything on the column with the "B1318849".
Use a helper column to show when the are duplicates - ie cell [col, n] and cell [col, n+1] are the same.
It will go wrong if your regular expression searches do not cope with all variations of the number formats.
1. It is easy (create a new column if required) to delete the space in 'B13 18849' using a Regular Expression search. Search for the first 3 characters, the space, and the last 5 characters, and write back without the space.
Code: Select all
Find (...)([:space:])(.....)
Replace $1$3
Code: Select all
Find ([:digit:]*)(/)(........)(.*)
Replace $3
You now have helper columns in each spreadsheet with the identically formatted eight characters "B1318849" in them.
See [Tutorial] How to record a macro (and Regular Expressions) for help with regular expressions.
Someone else should be able to help with the rest.
As a one time operation, copy everything into one spread sheet having added a column to identify which sheet the line is from.
Sort everything on the column with the "B1318849".
Use a helper column to show when the are duplicates - ie cell [col, n] and cell [col, n+1] are the same.
It will go wrong if your regular expression searches do not cope with all variations of the number formats.
Last edited by John_Ha on Tue Mar 24, 2020 5:47 pm, edited 1 time in total.
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
-
- Posts: 4
- Joined: Tue Mar 24, 2020 2:34 pm
Re: Comparing two documents in Calc with different fomats
Thanks a lot John. This has helped me a lot. I hope I can figure this out quick, but I think you've helped a ton.
John_Ha wrote:I can start you off.
1. It is easy (create a new column if required) to delete the space in 'B13 18849' using a Regular Expression search. Search for the first 3 characters, the space, and the last 5 characters, and write back without the space.
2. If the format of '106587/B131884900-10001' is always the same it is easy to extract the 'B1318849' as the eight characters following the / using a Regular expression. Search for any number of digits, for the /, for the next 8 characters. Write back the 8 characters.Code: Select all
Find (...)([:space:])(.....) Replace $1$3
I have not tested the arguments above.Code: Select all
Find ([:digit:]*)(/)(........)(.*) Replace $3
You now have helper columns in each spreadsheet with the identically formatted eight characters "B1318849" in them.
See [Tutorial] How to record a macro (and Regular Expressions) for help with regular expressions.
Someone else should be able to help with the rest.
As a one time operation, copy everything into one spread sheet having added a column to identify which sheet the line is from.
Sort everything on the column with the "B1318849".
Use a helper column to show when the are duplicates - ie cell [col, n] and cell [col, n+1] are the same.
It will go wrong if your regular expression searches do not cope with all variations of the number formats.
OpenOffice 4.1.3 on Windows10
Re: Comparing two documents in Calc with different fomats
I put 154557/B0711671-10062 in cell G4 of "Sample One.ods" and turned off "match whole cells" in the calculation options of "Sample Two.ods"
Then I tried this in "Sample Two.ods":
Instead of the expected value 4 this formula returns the no match error #NA.
After copying the source column from "Sample One" to "Sample Two".Sheet2, the following formula does return the expected value 4.
If this is a bug, it affects OpenOffice4 and LibreOffice6.
Then I tried this in "Sample Two.ods":
Code: Select all
=MATCH("/"&SUBSTITUTE($C3;" ";"")&"-";'file:///tmp/Sample One.ods'#$Sheet1.$G$1:$G$7;0)
After copying the source column from "Sample One" to "Sample Two".Sheet2, the following formula does return the expected value 4.
Code: Select all
=MATCH("/"&SUBSTITUTE($C3;" ";"")&"-";$Sheet2.$G$1:$G$7;0)
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Comparing two documents in Calc with different fomats
I told you someone (with better knowledge) would come along.
I only have a small brain and Villeroy's formula is too hard for me to get my mind round
I prefer lots of small steps because even my little brain knows it does what I think it does.
Some other searches which work are below. I copied the data into a helper (coloured) column. I highlighted the data and then Edit > Find and Replace ..., in the Current Selection which searched only the highlighted data.
Incidentally, your business is breaking the most fundamental rule of data management which is "A piece of data should be stored in only one place". If the same piece of data is stored in more than one place they will ALWAYS become different over time and you will not know which is correct. Businesses should always use a database and not individual spreadsheets to manage the business. It's why Larry Ellisson is a multi-billionaire.
I only have a small brain and Villeroy's formula is too hard for me to get my mind round
I prefer lots of small steps because even my little brain knows it does what I think it does.
Some other searches which work are below. I copied the data into a helper (coloured) column. I highlighted the data and then Edit > Find and Replace ..., in the Current Selection which searched only the highlighted data.
Code: Select all
(.*)( )(.*)
$1$3
([:digit:]{6})(/)([:alpha:][:digit:]{7})(.*)
$3
- Attachments
-
- Sample One jh.ods
- (13.78 KiB) Downloaded 127 times
-
- Sample Two jh.ods
- (14.89 KiB) Downloaded 126 times
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
Re: Comparing two documents in Calc with different fomats
These data had been extracted from some kind of database anyway. I doubt that anyone typed all this into spreadsheets. As soon as your data leave the business database they become very difficult to compare, let alone when they come from different databases. Everything in one place and that place needs to be some company database.John_Ha wrote: Businesses should always use a database and not individual spreadsheets to manage the business. It's why Larry Ellisson is a multi-billionaire.
My formula is just a simple MATCH with a link to a column in another file.
=MATCH(text ; column ; 0)
and the lookup text is a preceeding "/" the cell string without space and a trailing "-": "/B0711671-". The formula works fine when the search column is copied into the same document but fails with an external reference.
If the MATCH works, you get any other column value from the other file with =INDEX(other_column ; MATCH(text ; column ; 0))
Oh, and this does work:
Code: Select all
=INDEX('file:///tmp/Sample One.ods'#$Sheet1.$A$1:$A$7 ; MATCH("/"&SUBSTITUTE($C3;" ";"")&"-";$Sheet2.$G$1:$G$7;0))
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
-
- Posts: 4
- Joined: Tue Mar 24, 2020 2:34 pm
Re: Comparing two documents in Calc with different fomats
Thanks for the answer. I can't get it to work, but I'll try and figure it out.
Villeroy wrote:I put 154557/B0711671-10062 in cell G4 of "Sample One.ods" and turned off "match whole cells" in the calculation options of "Sample Two.ods"
Then I tried this in "Sample Two.ods":Instead of the expected value 4 this formula returns the no match error #NA.Code: Select all
=MATCH("/"&SUBSTITUTE($C3;" ";"")&"-";'file:///tmp/Sample One.ods'#$Sheet1.$G$1:$G$7;0)
After copying the source column from "Sample One" to "Sample Two".Sheet2, the following formula does return the expected value 4.If this is a bug, it affects OpenOffice4 and LibreOffice6.Code: Select all
=MATCH("/"&SUBSTITUTE($C3;" ";"")&"-";$Sheet2.$G$1:$G$7;0)
OpenOffice 4.1.3 on Windows10
Re: Comparing two documents in Calc with different fomats
Save the 2 files to the same directory and open "Sample Two.ods"
- Attachments
-
- Sample Two.ods
- (15.14 KiB) Downloaded 134 times
-
- Sample One.ods
- (18.11 KiB) Downloaded 139 times
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice