[HSQL, Base, Basic] Macro Driven CSV Import

Some examples to be used directly
Forum rules
No question in this section please
For any question related to a topic, create a new thread in the relevant section.
Post Reply
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

[HSQL, Base, Basic] Macro Driven CSV Import

Post by Villeroy »

Download link: https://www.mediafire.com/file/1elfk1wo ... t.zip/file

This demo covers the following scenario: You have some table which needs to be updated from csv files on a regular basis. Both, the data table and the text file, use the same primary key (aka ID No.). If some ID in the text file matches an ID in the data file, the table record should be updated from the text file. If some ID in the text file has no matching ID in the data file, the record should be inserted as a new table record.

I split the task into 3 steps:
1) Replace any existing file "textfile.csv" in the database directory with another text file selected in a file picker dialog.
2) Update existing records in the data table with records from the text file sharing the same ID number.
3) Insert new records that have no matching records in the data table.
It is easily possible to do all this in two steps or in one step without changing the macro code.

Exploring This Demo
The linked zip file contains a database document and a HSQL2 database with a HSQL database driver of version 2.4.1. A more recent driver may work as well. I'm not sure about that.

The same tricks should work with any database engine able to link tables with text files. However, I gave up on embedded HSQL 1.8 because I was unable to find a working UPDATE statement. HSQL2 supports SQL MERGE which makes things very easy.

Incoming CSV Files
The attached zip file also includes two sample files members_1.csv and members_2.csv.
Both csv files look like this:

Code: Select all

id;name;donation;birthday
12;Clara;99,66;31.12.1999
The header row does not matter. It is ignored anyway. First column is the ID column (common primary key with the data table), a text, a comma decimal and a German date in D.M.YYYY format, all fields separated by semicolons.

Open the database document. A macro will connect the document to its database automatically. Click the "Tables" section where you find a text table named "CSV", a table named "DATA" and a view "CSV2DATA". All of them return empty record sets when you open this database for the first time.

Linked Table named "CSV"
Any csv file with the above specified structure (semicolon separated, one header row, 4 columns for id, text, comma decimal and German date) can be linked to the text table named "CSV". The invalid decimals and dates are imported as VARCHAR (text). For now table "CSV" is empty because the linked csv file is missing.

The target table "DATA"
The target table named "DATA" has columns for text ("N"), decimal values ("V"), dates ("BD") and the primary key "ID" which is not an auto-ID because it gets its IDs from imported text files.

On-The-Fly Text Converter "CSV2DATA"
In database document's tables section there is also a view named "CSV2DATA". It converts the invalid decimal strings and date strings into valid decimals and dates. The view shows no records as long as the "CSV" table is empty. It is a known limitation that Base can not edit views directly. The query "qCsv2Data" is the editable version of the view "CSV2DATA". In case of a necessary change, you would delete the existing view, edit and save the query, right-click the query icon and choose "Create as view..." and save the view under the same name as the deleted one.

Hands On -- Manual Text Table Import
For better understanding let us skip the forms and macro stuff for now and import the first record set manually:
1. Shut down the office suite. Yes, completely including any "quick starter".
2. Copy members_1.csv into the database directory and rename the copy to textfile.csv. You may have to remove any existing textfile.csv before renaming when using a graphical file manager.
3. Open the database again and have a look at the text table which is linked to the textfile.csv. It shows 12 records with raw text data. The German dates and decimals are text. Only the ID is declared as integer. Open the view "CSV2DATA" which shows the text, true decimals, true dates and the ID in the same order of columns as the corresponding columns of the "DATA" table.
4. Close tables and view, copy the icon "CSV2DATA" and paste onto the "DATA" icon. On the import dialog just click the "Create" button.

Now you have manually appended the text file records to the "DATA" table via text table "CSV" (raw data import) and view "CSV2DATA" (conversion from raw text data into correct column types) with a simple copy and paste.

Macro Driven Update From Text Files
1) Open the import form. If you did the manual import, you see the imported table records on the left and the text file records in the middle. On the right side you see the table records having the same ID as the current text file. For now, these 3 record sets are identical. If you did not do the manual import, click the first push button labeled "1. Replace Text File" and navigate to "members_1.csv". Then click "3. INSERT" and you get the same form contents.

2) Click "1. Replace Text File" and navigate to "members_2.csv". This record set includes 2 existing records with ID numbers 5 and 12 shown in the yellow table and a new record with ID 678 in the green table. In the white table on the right you see the table records with IDs 5 and 12 corresponding to the yellow records.

3) Click "2. UPDATE" to overwrite the white table records with the yellow ones.

4) Click "3. INSERT" to append the green records.

If you did the manual import, you already know what the first macro does. It replaces textfile.csv with a file chosen by a file picker dialog.
The other 2 buttons trigger UPDATE and INSERT commands. The 2 commands are documented in the second form document "SQL".

Customizing the Basic code
For your own solution based on my macros the following issues are important:
1) The "flavour" of your csv files. It determines everything else. A csv file having ISO dates and point decimals would not require any additional view to convert the strings into correct values.
2) The names of the text file, tables, views, all the column names and data types. A view can sync the data types and the column order of a text table with the data table like my view "CSV2DATA" synchronizes the types and column orders of the text table "CSV" (linked to textfile.csv) and the data table "DATA".
3) This form's UPDATE and INSERT buttons trigger SQL commands stored in the push button's property "Additional information". Just have a look at the push button properties. The "Additional Information" field is a one-line text. When you copy the SQL from a multi-line text, make sure that you have empty spaces where line breaks used to be in the original text. Multiple commands need to be separated by semicolon. For example, a semicolon separated command sequence can run INSERT, UPDATE and DELETE commands from the same push button in one go.
4) Most of the customization needs to be done in plain simple SQL. Once you reached the point where you worked out your UPDATE, INSERT and/or DELETE statements, you can customize my macro by editing the constants on top of Module1.
5.1) The first group of constants affects the behaviour of the file picker dialog including an optional default directory specified in system notation such as "C:\path\to\Downloads\". If cInitPath is empty, the file picker starts in the documents folder as configured in the office's path settings. Change constant cPattern to "*" if the file picker should display all files, "*.csv" shows files ending with .csv.
5.2) Constant cPath specifies the database directory if that happens to be on another path than the database document. An empty string indicates that the database directory can be found on the same path as the database document. Use system notation such as "C:\path\to\database_folder". In this demo cPath is empty.
5.3) cFile is the file name if cPath is given. If cPath is empty, cFile specifies the path to the text file starting at the document's path with slashes as path delimiters. In the current example it is "database/textfile.csv".

SQL statements to define the connection between HSQL table and text file
http://www.hsqldb.org/doc/guide/texttables-chapt.html and https://forum.openoffice.org/en/forum/v ... hp?t=23260 describe how to connect a text table with a text file.
In this demo I ran the following 2 commands to connect textfile.csv with text table "CSV". This is not possible to do in the Base GUI:

Code: Select all

CREATE TEXT TABLE PUBLIC.CSV(ID INTEGER PRIMARY KEY,N VARCHAR(30),V VARCHAR(8),BD VARCHAR(10));
SET TABLE PUBLIC.CSV SOURCE 'textfile.csv;fs=\semi;ignore_first=true'
The file replacement macro runs

Code: Select all

SET TABLE "CSV" SOURCE OFF
before replacing the textfile.csv After replacing the file it runs

Code: Select all

SET TABLE "CSV" SOURCE ON
in order to reconnect the "CSV" table with the new file.
Last edited by Villeroy on Fri Jul 15, 2022 2:34 pm, edited 1 time in total.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [HSQL, Base, Basic] Macro Driven CSV Import

Post by Villeroy »

WIth a little help from user Rafkus_pl I managed to do the same trick with embedded HSQL.
Everything is the same except for the file path in the Basic macro and the SQL to update and insert.
Download the TextImport_embedded.odb and use the sample csv files from the above linked package or anything like this csv flavour with comma decimals and German dates:

Code: Select all

id;name;donation;birthday
12;Alfred;56;31.12.1912
678;Beth;13,76;5.3.2001
5;Carlos;99,99;04.3.1948
Attachments
TextImport_embedded.odb
(29.06 KiB) Downloaded 373 times
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [HSQL, Base, Basic] Macro Driven CSV Import

Post by Villeroy »

Another example with embedded HSQL and code from [Basic, SQL] CSV import into some database
You have to adjust the URL-paths in the first module before using this sample.

---- EDIT: 2023-02-22 Documentation of working routine -----
This is the function which puts everything together:

Code: Select all

Function ImportCSV(sURL$, sDBPath$, sTextFile$, sTextTable$, sView$,  sDataTable$) As Long
  • The function returns the count of appended records.
  • sURL is the URL of the csv file to be imported.
  • sDBPath is the Path-URL of the database directory (document path in case of embedded HSQL, database directory in case of external HSQL)
  • sTextFile is the name of the text file in the database directory which is linked to a text table (“import.csv” in the sample).
  • sTextTable is the name of that HSQL text table. The text table imports all columns as text (VARCHAR) unless they are SQL compatible (“import_csv” in the sample).
  • sView is the name of the view which transforms the text table into a record set compatible with the target table (“import_View” in the sample).
  • sDataTable is the name of the target table (“Data” in the sample)
The code disconnects the text file from the text table, copies the text file (sURL) to the name sTextFile in the database directory, reconnects the replaced text file with the text table. Finally it performs INSERT INTO “Data” (SELECT * FROM “ImportView”).

The view named “ImportView” is a SELECT statement which rearranges the text table's columns into the right order matching the column order of the target table while converting incompatible strings (comma decimals and German dates) into compatible column types. Additionally it filters out duplicates already existing in the target table.

Since Base can not edit database views, I saved a query “ViewSource”. In order to edit the view, you delete the view, edit the query and do a right-click>"Create as View..." and provide the name of the view (“import_View” in the sample). The view will appear among the tables. HSQL can not see any queries stored in the Base document. However, it can see views because views are stored in the database itself.

The SQL code copied from the query:

Code: Select all

SELECT "B".* FROM (SELECT "A"."N" AS "Name"
 , cast(substr("A"."T",2,1)||':'||substr("A"."T",2,3)||':'||substr("A"."T",2,5) AS TIME) AS "Time"
 , NULL AS "Remark"
 , CAST(replace("A"."AMT",',','') AS DECIMAL(10,2)) AS "Amount"
 , FALSE AS "Check"
 , CAST(right("A"."D",4)||'-'||left("A"."D", "A"."SL1" -1)||'-'||substr("A"."D", "A"."SL1"+1, "A"."SL2"-"A"."SL1"-1) AS DATE) AS "Date"
 , "A"."ACC_ID" AS "Account"
 , NULL AS "ID"
 FROM(SELECT "import_csv".*, LOCATE('/',"D",1) AS "SL1", LOCATE('/',"D",4) AS "SL2" FROM "import_csv") AS "A") AS"B"
LEFT JOIN "Data" ON "Data"."Name"="B"."Name" AND "Data"."Time" = "B"."Time"AND "Data"."Amount" = "B"."Amount" AND "Data"."Date" = "B"."Date" AND "Data"."Account" = "B"."Account" 
WHERE "Data"."ID" IS NULL
ORDER BY "Date" ASC, "Time" ASC
Attachments
csv_import.odb
embedded hsqldb with macro
(27.84 KiB) Downloaded 336 times
import.csv
silly csv
(172 Bytes) Downloaded 341 times
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Post Reply