[Solved] Query and store results from multiple spreadsheets

Discuss the spreadsheet application

[Solved] Query and store results from multiple spreadsheets

Postby peace2 » Sat Feb 15, 2020 9:00 pm

Hello out there!

I have 10.000+ calc spreadsheets, each in a single file, with almost 99% identical structure and would like to query on single fields of each spreadsheet.

Basic structure of each spreadsheet/file:
A1: product ID; A2: product123
. rows A2
. till
. A5 not important for the query.
A6: sales 2012; B6: 123.00
A7: sales 2013; B7: 456.00
A8: sales 2014; B8: 789.00

One sample query would be: which products had sales of between 250 and 350 in year 2012?
Another would be: which products had sales of over 450 in year 2014?

Any advice on these issues is highly appreciated!

peace
Last edited by peace2 on Tue Feb 18, 2020 8:07 pm, edited 1 time in total.
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Sat Feb 15, 2020 10:46 pm

peace2 wrote:Hello out there!

I have 10.000+ calc spreadsheets, each in a single file, with almost 99% identical structure and would like to query on single fields of each spreadsheet.


The first question is just how close to 100% is your 99% identical structure
Second, what are the differences?

How many rows (approximately) per spreadsheet?

Where do these sheets come from? Are they output form another program or hand entered?

How are the sheets named? Is there a regular pattern to the names.

Can you upload a couple of sample spreadsheets for us to examine. I think we need more than one to see if there are any significant differences in formatting.

There is an upload tab at the bottom of the Quote and the Reply windows
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby peace2 » Sun Feb 16, 2020 12:36 pm

In reply to your questions:

"The first question is just how close to 100% is your 99% identical structure?", and "...differences?".
Very close. There may be a row missing an entry at A5, how many I don't know.

"...rows per spreadsheet?"
Maximum of 45 rows per spreadsheet.

" ... spreadsheet origin, output ...?
Data was collected on a server probably running an SQL database. Data was extracted using C++ (probably) and then used to populate the spreadsheets in Excel (.xlsx). As I don't use Excel, the files are in their original state locally and opened with OpenOffice.

"...sheet names and patterns..."
Yes, please refer to the uploads attached here. Patterns are obvious. Especially, field value of B1 is identical to the file name, and is unique.

The sheets uploaded here, 3 each, have been revised to protect the delicate nature of the data. However, nomenclature is stringent.

Thank you very much for your attention to this matter!

peace
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby peace2 » Sun Feb 16, 2020 12:39 pm

Here the samples...
Attachments
AB_2013_00002_456789123a1bc2.xls
(8.5 KiB) Downloaded 21 times
AB_2013_00001_1234567890b1ab2.xls
(32.5 KiB) Downloaded 20 times
AB_2014_00002_789123456b2cg4.xls
(8.5 KiB) Downloaded 18 times
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Sun Feb 16, 2020 2:34 pm

Thanks. This is great.

I am not exactly sure what to do at the moment but this gives the Calc gurus a lot to work with. Am I correct in assuming that the data of interest is in that table from B12 to D44 and that the rest of the information can be ignored for this exercise?
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby peace2 » Sun Feb 16, 2020 2:56 pm

Thanks for your question. To clarify further:

Data range B12 to D44 is not of interest here. The queries I would like to run concern, first and foremost, fields B6, B7 and B8. I.e., which products (Ref-ID) had sales of above 2.500 in year 2013?

Or: which products (Ref-ID) had sales of less than 1.000 in year 2012?

Thanks,
peace
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby Villeroy » Sun Feb 16, 2020 4:24 pm

This is not possible if you split tiny pieces of information across different documents.
These documents look as if they aggregate some data that used to be stored somewhere else, e.g. in the URLs. IOW, these sheets look like database reports or something similar. You did not type these data into the sheets, right? In order to get better reports, you have to use the resource that produces the reports.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.4
User avatar
Villeroy
Volunteer
 
Posts: 28573
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Sun Feb 16, 2020 5:33 pm

I think Villeroy is correct. It is just not something a spreadsheet is designed to do. It probably is doable in another language. It looks like one should be able to do it in Python and seems reasonably easy for an experienced R user.

Are you stuck using Calc or another spreadsheet?
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby peace2 » Sun Feb 16, 2020 7:50 pm

Thank you both very much!

To Villeroy: The data was not manually entered into the spreadsheets. As described above, the spreadsheet population of the data results (probably) through some SW that accessed the original database and produced Excel file output (possibly as exported db reports) for each database record.

To jrkrideau: I am not fluent in applying C+ or Python etc. Nor am I stuck with Calc or other spreadsheet SW. However, a little experience with databases. So could the following achieve the task?

1. Create a database (preferably OpenOffice)
2. Import selected data (fields B1 as unique key, B6, B7 and B8) from each single file to the DB, each as an individual record. This needs to be automatic, given the number of files (10.000+).
3. Apply the internal db query function
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Sun Feb 16, 2020 8:43 pm

peace2 wrote:Thank you both very much!
. However, a little experience with databases. So could the following achieve the task?

1. Create a database (preferably OpenOffice)
2. Import selected data (fields B1 as unique key, B6, B7 and B8) from each single file to the DB, each as an individual record. This needs to be automatic, given the number of files (10.000+).
3. Apply the internal db query function


This is way outside my area of expertise. Villeroy should be able to make some good comments.
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby Lupp » Sun Feb 16, 2020 9:46 pm

As soon as the data are collected in a spreadsheet, there should be means to do the "query". COUNTIFS(), SUMPRODUCT() and probably a PivotTable will help.
Try the attached wizard file. It's a spreadsheet document only serving as a container for user code and some short explanations.

(Spreadsheets aren't actually the appropriate means to perform a query here. However, AOO or LibO can at least open the silly .xls files which were created by the original misuse of spreadsheets. I won't write a program to directly extract data from .xls.)

Editing: I missed to mention that the macro 'processTheMess' is what you need to run. The other macro is run onDocumentViewCreated.
Editing again: In a test with 50 source files the Sub needed about 360 ms per file on my system.
For the "10000+" files it may need an hour(?). This is not (or only by a marginal amount) caused by "poor Basic". To open a spreadsheet file and to convert it from xls storage to running representation in AOO/LibO needs some time unavoidably.
Attachments
converter.ods
(11.95 KiB) Downloaded 21 times
On Windows 10: LibreOffice 7.0 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
Lupp
Volunteer
 
Posts: 2918
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: Query and store results from multiple spreadsheets

Postby MrProgrammer » Sun Feb 16, 2020 10:18 pm

Hi, and welcome to the forum. You have posted in the Calc forum, however I do not believe that Calc is a good tool to use for this situation.

peace2 wrote:I have 10.000+ calc spreadsheets
peace2 wrote:Data was extracted using C++ (probably) and then used to populate the spreadsheets in Excel (.xlsx)
No, you have many Excel-format spreadsheets, which were created using some non-standard process which seems to have used an unknown version or versions of Excel. If the files are .XLSX format, why do you attach .XLS format files in your post? The two formats are quite different.

peace2 wrote:As I don't use Excel …
Well, for this project you should use it, since OpenOffice/LibreOffice is not a clone of Excel and could introduce errors, even if this seems unlikely. Surely this project is for a company, and they must hold an Excel license, and they do not want errors. If they've selected you for this work, they should supply a copy of Excel for you. If they are reluctant, you can explain that a non-Excel solution introduces additional, unknown risk. Then you have a defense if problems arise.

peace2 wrote:Any advice on these issues is highly appreciated!
• My advice is to use Excel, and only Excel, to convert these files from spreadsheet format, either .XLS or .XLSX whatever you have, to Comma Separated Value format .CSV using Excel macros and Windows scripting. For details about that, consult an Excel forum. I am confident that others have done this and documented the method.
• If you cannot or will not use Excel, a web search will no doubt find non-Microsnot tools which can convert these spreadsheets to .CSV files and you can call the tool in a script to convert the thousands of files. You want a non-interactive tool which you install on your computer like soffice --headless --convert-to, not an online converter.

Either way, you now have thousands of .CSV files. These are simple text, and a script can read and merge them, creating one big .CSV file. For details about that, consuit a scripting forum. You can then import the .CSV data into the database of your choice. Personally I would recommend Perl for this work, though JScript is another possibility, essentially the same language as Javascript. Either can easily read and merge .CSV files.

If this solved your problem please go to your first post use the Edit button and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.7 Build 9800 on MacOS 10.14.6.   The locale for any menus or Calc formulas in my posts is English (USA).
User avatar
MrProgrammer
Moderator
 
Posts: 3952
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Sun Feb 16, 2020 11:08 pm

@ MrProgrammer
It may take a few minutes to install some packages but I believe it can be done is about 5 lines of code in R.
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby Lupp » Sun Feb 16, 2020 11:23 pm

And would R also help to extract the data from the perverse representation they are supplied in?
On Windows 10: LibreOffice 7.0 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
Lupp
Volunteer
 
Posts: 2918
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Mon Feb 17, 2020 12:53 am

Lupp wrote:And would R also help to extract the data from the perverse representation they are supplied in?

I am not actually sure of the question but I think so. See the attached .csv file. This is my best guess of what the OP wants. It can be supplied in .ods & .xlsx formats but I had it in .csv format as I was playing around with reading it into SQlite. I know nothing about databases and it shows. :knock:
 Edit: Replaced the .csv file with, I hope, a non empty file 
Attachments
tbl.csv
(443 Bytes) Downloaded 13 times
Last edited by jrkrideau on Mon Feb 17, 2020 2:57 am, edited 1 time in total.
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby Lupp » Mon Feb 17, 2020 1:17 am

Sorry. The above attached csv only contained two empty lines. (One byte: the newline character.)
But my question was not about how to process data supplied by csv. It was about how the OQ might get the data for evaluation out of his obscure .xls (or .xlsx) files. In fact the simplest way I personally know and have the software for is to open every file and to copy the data to a usable representation.
For opening the MS spreadsheets the only reliable tool I have at hand is LibO (in case of .xls also AOO).
For delivering the extracted data I then also used a spreadsheet, now a Calc type, because this is a Calc forum. Of course you can also add a line to the the little Basic Sub I supplied which writes to a csv file instead.
In fact I would prefer the .ods after the first step because some interim adjustments may be needed for which the visual control you have when working in a spreadsheet in combination with the functional tools should be of advantage.
Doing the same work based on any general purpose programming system, I would need to precisely and finally define the structure/syntax of the data as found and of the data as needed as well. Based on what we were told by the OQ I cannot even extract the relevant portions of information from the ID-rows in the files.
You will also have noticed, that every file actually should be converted to a one-row-dataset.
What my "wizard" made of the three given example files you find attached.
result.ods
(21.66 KiB) Downloaded 18 times


By edditing I add an attachment showing how the raw data might be treated in a few steps. This needed a minute for the three datasets, and might need 2 minutes for 10000. This again cannot address, of course, the task to extract info (What product? What shop? e.g.) from the compound "Reference-ID" or the "Published" line. No way to know for me. The OQ is the one to do that.
resultTransposedInPartsFilteredAndPrettified.ods
(23.75 KiB) Downloaded 15 times


If you didn't notice my first post in this thread: Here it is viewtopic.php?f=9&t=101108#p487240.
On Windows 10: LibreOffice 7.0 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
Lupp
Volunteer
 
Posts: 2918
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: Query and store results from multiple spreadsheets

Postby jrkrideau » Mon Feb 17, 2020 2:55 am

Very nice. I had a look at your first post and file and decided it was too complicated for me.

I liki the way you have transformed the data. I did not bother so currently I have three rows of data for each fire so the id is not a unique identifier.

I am sorry about the empty file. I was playing about getting the data into an SQlite table and seem to have done something stupid in the R file. I send the post and went to dinner. Just a few minutes ago I realized the dataframe was empty hence the .csv file. I was thinking I should check it on the forum when I saw your message.

I will attach a new one which will, I hope, contains some data.
Attachments
tbl.csv
(443 Bytes) Downloaded 17 times
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Query and store results from multiple spreadsheets

Postby peace2 » Tue Feb 18, 2020 7:41 pm

Thanks Lupp for the furtherance of my issue! Based on the your scripts and info I am able to solve my tasks, as set forth in my opening input above.

The data as provided from here in the .xls spreadsheets was extracted and merged into one spreadsheet. To clarify: My original data is in .xlsx format. However, i do not use Excel. The original data is represented in a satisfactory way using OOcalc. I could have uploaded the .xlsx files, but they hold proprietary information. So, after changing proprietary info and loading each of them into OOcalc, I saved each in .xls format, which OOcalc readily provides.

I looked into your Macro 'ProcessThisMess', found the entry that calls .xls files and changed the entry to: .xlsx ... my assumption was: if it works for .xls, why shouldn't it do so for .xlsx? And it worked! Your Macros processed over ten thousand single files in about two hours. The result was an OOcalc spreadsheet representation and file is just 1,5mb in size.

Next: I applied a standard OOcalc filter to column B and was able to perform the queries I needed. Since the filter output contains the row number, it is of no great effort to subsequently look up corresponding product and reference IDs.

So Lupp: :bravo: and thanks again.

peace
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: Query and store results from multiple spreadsheets

Postby peace2 » Tue Feb 18, 2020 7:49 pm

Some general and closing statements:

Villeroy wrote: In order to get better reports, you have to use the resource that produces the reports.

I disagree here. If u have a dataset produced by SW "X" and apply a plug-in/module/script etc. that strips the dataset of proprietary control characters etc., the u can process/manipulate the data with almost any 'resource' (SW "Y").

MrProgrammer wrote: If they've selected you for this work, they should supply a copy of Excel for you. If they are reluctant, you can explain that a non-Excel solution introduces additional, unknown risk.

This is well understood. However, I am convinced, that OOcalc is very close to Excels functionality (as are the other modules in OO), and I propagate and support this. ...and that the OO community can get the job done!

MrProgrammer wrote: Personally I would recommend Perl for this work, though JScript is another possibility, essentially the same language as Javascript. Either can easily read and merge .CSV files.

Sure, there are ways to solve the task in a more elegant manner. However, I would assume that most of the OO users don't command your expertise.
openoffice 4.1.6
windows
peace2
 
Posts: 7
Joined: Sat Feb 15, 2020 8:03 pm

Re: [Solved] Query and store results from multiple spreadshe

Postby Lupp » Wed Feb 19, 2020 1:38 am

@peace2: The second hour you needed beyond my estimate of one hour may be due to a more costly treatment of .xlsx as compared to .xls.

@MrProgrammer: Just for curiosity :: Do Perl and JScript actually provide efficient means to extract information from .xlsx files?
On Windows 10: LibreOffice 7.0 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
Lupp
Volunteer
 
Posts: 2918
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: [Solved] Query and store results from multiple spreadshe

Postby Villeroy » Wed Feb 19, 2020 2:50 pm

Lupp wrote:@peace2: The second hour you needed beyond my estimate of one hour may be due to a more costly treatment of .xlsx as compared to .xls.

@MrProgrammer: Just for curiosity :: Do Perl and JScript actually provide efficient means to extract information from .xlsx files?

Perl is the most powerful text parser. Once you have extracted the XML, there are no limits.

Search result for docx on cpan.org: https://metacpan.org/search?q=docx which is a repository of Perl modules.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.4
User avatar
Villeroy
Volunteer
 
Posts: 28573
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [Solved] Query and store results from multiple spreadshe

Postby Lupp » Wed Feb 19, 2020 4:37 pm

Villeroy wrote:Perl is the most powerful text parser.

I never used Perl (and don't actually consider to change that), but I wouldn't doubt your educated opinion.

Villeroy wrote:Once you have extracted the XML, there are no limits.

Yes. Once I have (someone has) extracted...

My concern (when I prepared my answer) was the extraction. The rest seemed rather simple.

Regarding the extraction of data from .xlsx there seems not to be built-in capability in Perl if I understood your post correctly. To "parse" by eye and mind what the link you gave returns, and to find a good reliable solution this way for a task that's probably one-time-and-never-again would cost me 10 times the time (rather much more) I needed for a handyman-solution in Basic.

I don't aim at becoming a professional programmer of nowadays. And my toolbox concerning traditional "hardware" consists of half a dozen screwdrivers, a few hammers, and a single electric borer. I love doing things with simple, easily understandable means. I also love to use these means in a clever, creative way. That may not be the profession-do. "Amateur" stems from love, not from earnings and not from duty - which I, of course, also respect.
On Windows 10: LibreOffice 7.0 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
Lupp
Volunteer
 
Posts: 2918
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: [Solved] Query and store results from multiple spreadshe

Postby Villeroy » Wed Feb 19, 2020 7:26 pm

Perl is a full featured programming language. It can handle zip archives, extract the all values from a cell range (columns A and B in this case) from any kind of valid XML until it hits a blank row or whatever. I do even think, this is a matter of max. 10 lines of Perl code (20 if I would try). Many Perl coders take the sportive challenge to wrap such programs into a single line which is why Perl is often described ironically as a "write only language". Whatever has been written by some Perl enthusiast is incomprehensible by anybody else and after one or two years not even the author understands his own code.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.4
User avatar
Villeroy
Volunteer
 
Posts: 28573
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [Solved] Query and store results from multiple spreadshe

Postby MrProgrammer » Wed Feb 19, 2020 7:40 pm

Lupp wrote:@MrProgrammer: Just for curiosity :: Do Perl and JScript actually provide efficient means to extract information from .xlsx files?
My suggestion was to use Excel to convert the thousands of XLSX files to thousands of CSV files, then use a scripting language to do the rest of the work, extracting the desired lines/fields from the CSV and merging them. But Perl does have a wide variety of modules which allow it to read/write foreign/proprietary formats including CSV, XML, PDF, JSON, JPG, DOC, RTF, ODS, XLS, and XLSX. JScript could call Excel's API, I believe.
Mr. Programmer
AOO 4.1.7 Build 9800 on MacOS 10.14.6.   The locale for any menus or Calc formulas in my posts is English (USA).
User avatar
MrProgrammer
Moderator
 
Posts: 3952
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: [Solved] Query and store results from multiple spreadshe

Postby Villeroy » Wed Feb 19, 2020 8:10 pm

OK, it won't hurt to give a quick try:
1) Put a copy of all your .xls files in some directory.
2) Download the document I'll attach to this topic and save it in the same directory.
3) Call Tools>Options>Security... button [Macro Security], tab "Trusted sources" and add that directory to the trusted sources (documents in that directory are allowed to execute their macro code)
4) Open my document and click the [GO] button.
If all the xls files are the same as the 3 example files (2 columns of data starting at A1 with a blank row between the first rows and the "Months" rows), the macro code should extract all relevant data (from A1 to Bxx without the months table) with the associated file names.

The document is made from this template: viewtopic.php?f=21&t=77069
The only specification I made on the setup sheet is *.xls as file name pattern. This means that we import data from each .xls file's first sheet in the same directory as this .ods file. I made one modification in the Basic code. The original code fetches the used area of each source sheet. With the modification it fetches the current region around cell A1 of each source sheet.
Attachments
topic_77069.ods
(27.3 KiB) Downloaded 14 times
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.4
User avatar
Villeroy
Volunteer
 
Posts: 28573
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [Solved] Query and store results from multiple spreadshe

Postby jrkrideau » Wed Feb 19, 2020 8:53 pm

Aha, I got it to work!

Am I misreading the OP's posts? I thought he was only really interested in the Sales figures but I see both you and Lupp include extra info.
LibreOffice 6.4.3.2; Ubuntu 20.04
jrkrideau
Volunteer
 
Posts: 3788
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: [Solved] Query and store results from multiple spreadshe

Postby Villeroy » Wed Feb 19, 2020 11:18 pm

Code: Select all   Expand viewCollapse view
Reference-Id:   AB_2014_00002_789123456b2cg4
Published:   789123456b2cg4
URL:   http://url3/index.php/cat14/article17 (6048)
   http://url3/index.php/cat14?page=3 (12)
   http://url3/index.php/cat14 (3)
Sales 2012:   2995
Sales 2013:   12230
Sales 2014:   6048
Reference-Id:   AB_2013_00002_456789123a1bc2
Published:   456789123a1bc2
URL:   http://url12/index.php/cat3/article5 (20380)
   http://url12/index.php/cat3 (20)
   http://url12/index.php/cat3?page=6 (320)
Sales 2012:   2764
Sales 2013:   7395
Sales 2014:   10358
Reference-Id:   AB_2013_00002_456789123a1bc2
Published:   456789123a1bc2
URL:   http://url12/index.php/cat3/article5 (20380)
   http://url12/index.php/cat3 (20)
   http://url12/index.php/cat3?page=6 (320)
Sales 2012:   2764
Sales 2013:   7395
Sales 2014:   10358

After importing the files, it should be verified that the imported data
1. include the wanted 2 columns plus the file names and nothing beyond column F
2. that every block of data consists of 8 rows: Reference-Id, Published, 3 URLs, Sales 2012, Sales 2013 and Sales 2014 (and not any sales of 2011 or 2015 with 2 or 4 URLs)
Then it should be possible to utilize the INDEX function in order to generate a normalized table.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04, no OpenOffice, LibreOffice 6.4
User avatar
Villeroy
Volunteer
 
Posts: 28573
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany


Return to Calc

Who is online

Users browsing this forum: RPG and 30 guests