[Solved] Move duplicate rows to new column

Discuss the spreadsheet application
Post Reply
MGear
Posts: 4
Joined: Mon Apr 05, 2021 9:50 am

[Solved] Move duplicate rows to new column

Post by MGear »

Hi, I'm new to this forum! I'm importing product images as a CSV file to a database. Each row must have a individual key = SKU and one main image. Some products have additional images. However, the file that I should import has several rows with the same key (SKU). Example:

sku,image
product1,image1.jpg
product2,image2.jpg
product3,image3.jpg
product3,image3b.jpg
product3,image3c.jpg
product4,image4.jpg
product4,image4b.jpg
product5,image5.jpg
product6,image6.jpg

I need to separate the additional images (if any) to a third column: additional_images.

sku,image,additional_images
product1,image1.jpg
product2,image2.jpg
product3,image3.jpg,image3b.jpg;image3c.jpg
product4,image4.jpg,image4b.jpg
product5,image5.jpg
product6,image6.jpg

If there is more that one additional image they must be separated by ";"
(This is not a problem - this can be done with find/replace and there can exist a ";" on all rows, even single images)

This may be something like: If SKU is the same as previous row, then ... ?
A less elegant solution is find duplicates and move the additional images to another page somehow and import separately. This would have to be repeted if here is sevaral images.

I can't figure out how to do this. I do this on a regular basis so I'm looking for a solution that will be easy to work with. I have over 200000 rows so doing it manually is not an option...
Last edited by MrProgrammer on Tue Apr 13, 2021 3:38 pm, edited 1 time in total.
Reason: Tagged ✓ [Solved]
MGear
OpenOffice 4.1.9 / MacOS Big Sur
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Move duplicate rows to columns

Post by Villeroy »

You do not import the csv into a database. You import csv data into a calculator. A spreadsheet is a calculator. Spreadsheets have nothing in common with a database. There are no tables, no fields, no records in a spreadsheet.
This is a database task, particularly with hundreds of thousands of rows. Most databases have an aggregate function "GroupConcat" which concatenates strings from one field based on common values (groups) in other fields.

You may try my experimental Python macro which does this by grouping one spreadsheet column and concatenating another: viewtopic.php?f=21&t=59442
 Edit: You may want to change the line
sdelim = ','
into
sdelim = ';'
in order to separate by semicolon. You can do so before you push the install button. 
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
MGear
Posts: 4
Joined: Mon Apr 05, 2021 9:50 am

Re: Move duplicate rows to columns

Post by MGear »

I should have been clearer: I importing to Magento ecommerce software that uses a database (not the openoffice database). The way to import data is by using CSV files.
MGear
OpenOffice 4.1.9 / MacOS Big Sur
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Move duplicate rows to columns

Post by Villeroy »

The Magento database is able to calculate the aggregated table from the stored table data. It is a MySQL database according to https://devdocs.magento.com/cloud/docke ... abase.html
MySQL has a Group_Concat function: https://www.w3resource.com/mysql/aggreg ... concat.php

If you want to use the Magento database with your office suite, connect a Base document to the Magento's MySQL database. Writer can generate serial letters, Base's report engine can generate structured print-outs, Calc can calculate with database data.

As a benefit, you would not need any csv files anymore because you would work with the database itself.
Last edited by Villeroy on Mon Apr 05, 2021 2:25 pm, edited 3 times in total.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
MGear
Posts: 4
Joined: Mon Apr 05, 2021 9:50 am

Re: Move duplicate rows to columns

Post by MGear »

OK, thank you, I will defenetly look into that option. Currently that is not possible in our setup.

Any thoughts on how to do this in Openoffice Calc only?
MGear
OpenOffice 4.1.9 / MacOS Big Sur
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Move duplicate rows to columns

Post by Villeroy »

You certainly do NOT want to import the aggregation result into the Magento database. And I'm sure you can't because there might not be any table for this structure. The aggregate is a calculated result from stored data. It must not be stored in a separated table.

If you really want to export csv from database, then import csv into a calculator and then do that job on sheet, then feel free to use my macro program with a modified delimiter.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
MrProgrammer
Moderator
Posts: 4901
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: Move duplicate rows to columns

Post by MrProgrammer »

Hi, and welcome to the forum.
MGear wrote:I need to separate the additional images (if any) to a third column: additional_images.
Villeroy wrote:You certainly do NOT want to import the aggregation result into the Magento database.
I agree with Villeroy. This seems like a very strange layout for a database. Surely the original data layout will be more useful. It will be almost impossible to work with SKU, Image, Additional_Images in a database, I would think. However since you've asked, I would be happy to explain how to accomplish this blunder.
MGear wrote:Any thoughts on how to do this in Openoffice Calc only?
Import the data to sheet Before using the Text Import dialog. Both of your fields should be imported as Text. Then on sheet After convert the yellow cells from formulas to results. That is select these cells, Edit → Copy, Edit → Paste Special → ✓ Text → ✓ Numbers → ✓ Date & Time (but not Formulas) → OK. Use AutoFilter to display only the rows where End is 0; delete the rows in yellow; remove the filter. Now you are left with the rows where End is 1. Finally, File → Save As → Text CSV.
[Tutorial] Text to Columns and the Text Import dialog
202104061349.ods
Only three functions used: IF, T, and TRUE
(16.22 KiB) Downloaded 109 times
If you have not used filters, read section 7. Analyzing and modifying your data in Ten concepts that every Calc user should know.

It would be far simpler to accomplish this task without using OpenOffice where each time you have to import CSV, perform the actions above on the sheet, and then export CSV. Perl is already installed on your Mac, so can rearrange your data just by running a program. Direct further questions about Perl to a Perl forum. This Perl program uses Text::CSV which you will probably need to install with cpan -i Text::CSV. I realize you may not know Perl, but I will remind you that you didn't know how to accomplish the task with OpenOffice either, and using a Perl program will be much easier in the long run.
$ cat JoinCSV1.perl
#!/usr/bin/env perl
use strict; use warnings; use Text::CSV;         # Program initialization
my ($in, $out, $csv, $iar, @rec);                # Declare variables
$csv = Text::CSV->new();                         # Load CSV routines
open $in,  "<in.csv"  or die "in.csv: $!\n";     # Open input CSV
open $out, ">out.csv" or die "out.csv: $!\n";    # Open output CSV
$iar = $csv->getline($in);                       # Get ref to input array
@rec = (@$iar, 'additional_images');             # Set output header array
while ($iar = $csv->getline($in)) {              # While more CSV records
   if ($$iar[0] eq $rec[0]) {                    # If same product
      $rec[2] .= ';' if $rec[2];                 # Add ; if item not empty
      $rec[2] .= $$iar[1]; }                     # Add additional image
   else {                                        # If new product
      $csv->say($out,\@rec);                     # Write previous product
      @rec = (@$iar, ''); } }                    # Initialize next product
$csv->say($out,\@rec);                           # Write final product

$ cat in.csv
sku,image
product1,image1.jpg
product2,image2.jpg
product3,image3.jpg
product3,image3b.jpg
product3,image3c.jpg
product4,image4.jpg
product4,image4b.jpg
product5,image5.jpg
product6,image6.jpg

$ ./JoinCSV1.perl # Run the program above 

$ cat out.csv
sku,image,additional_images
product1,image1.jpg,
product2,image2.jpg,
product3,image3.jpg,image3b.jpg;image3c.jpg
product4,image4.jpg,image4b.jpg
product5,image5.jpg,
product6,image6.jpg,


If this solved your problem please go to your first post use the Edit button and add [Solved] to the start of the subject field. Select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).
MGear
Posts: 4
Joined: Mon Apr 05, 2021 9:50 am

Re: [Solved] Move duplicate rows to new column

Post by MGear »

The file 202104061349.ods worked perfectly in this case, thank you very much! :D
MGear
OpenOffice 4.1.9 / MacOS Big Sur
Post Reply