MTP wrote: I added a new [record] to the broken database. Only later did I copy the files. I'm afraid this may have created new (bad) files.
That may be true, but it's not your fault. After a power loss, it's best to make a copy of the split-database folder before restarting the RDBMS (database server app). But that's an extraordinary measure.
MTP wrote: ...checkpoint. I was wondering if never shutting down the server was what made the system susceptible to data loss of this kind.
Running a database server application (RDBMS) 24/7 shouldn't be a problem unless other software anomalies crop up over time -- such as incomplete garbage collection by the operating system, framework, or virtualization software. However, in the absence of a regular (daily) CHECKPOINT, the layers of protection are reduced. Given the normal file scheme employed by all the name-brand database systems (Oracle, MS SQL Server, DB2, PostgreSQL, MySQL, etc.), an open file-handle to the .log file (or its equivalent) does increase the chances of acute file corruption (session data loss). This dynamic .log file contains the session data since the last CHECKPOINT or SHUTDOWN. The design handles an instantaneous power-cut quite well, but anything becomes possible with a brown-out or electrical surge, including loss of the session data since the last CHECKPOINT. That's where a laptop-based server or a UPS (battery backup) comes in handy; you can even trigger a script based on battery level to issue SHUTDOWN utilizing sqltool.jar (as included with HSQLDB and with the Base templates and associated tutorials).
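A minimal sketch of such a script file, assuming a server-mode database reachable at localhost under the default SA account (the URL, paths, and credentials are placeholders to adjust for your own setup):

Code:
-- shutdown.sql : dispatched by a battery-monitor script via SqlTool, e.g.
--   java -jar /path/to/sqltool.jar --inlineRc=url=jdbc:hsqldb:hsql://localhost/mydb,user=SA,password= /path/to/shutdown.sql
SHUTDOWN;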
Routine tasks such as running CHECKPOINT (daily) or the occasional SHUTDOWN (weekly or monthly) of the database and engine are prudent. These commands can be issued manually (Tools > SQL), or automated by a scheduled script or macro.
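For reference, the two statements as you would enter them in the Tools > SQL console:

Code:
CHECKPOINT;  -- daily: folds the open .log session data into the .data file
SHUTDOWN;    -- weekly/monthly: incorporates the .log, then closes the database cleanly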
Either of these commands will increase the layers of protection by incorporating the session changes (the open .log file) into the relatively static .data and .backup files. The static nature of these files reduces the possibility of file corruption from power fluctuations, as the durability of the hard-drive subsystem becomes the limiting factor.
Another reason to issue the CHECKPOINT command is that third-party backup software may ignore open (locked) files by default, so the .log file may not be backed up in all cases. And because the .log file is dynamic in a running database, a third-party backup may not be useful for data recovery unless preceded by a CHECKPOINT. Since modern backup software monitors file changes in real time, simply issuing a CHECKPOINT at any time on a running database will trigger an automatic backup of the database (complete with off-site redundancy in the case of cloud backup services).
Now, HSQLDB 2.x is designed to accommodate much larger databases, perhaps running in high-concurrency environments (many active users). In my experience on a relatively slow 2009 laptop (see chart below), CHECKPOINT can process about 35 MB per second. So most Base users would experience a sub-second delay for CHECKPOINT processing (virtually instantaneous). On the other hand, a 1 GB database would require about 30 seconds of "down-time," as database transactions are paused during a CHECKPOINT. Now imagine the impact with a multi-terabyte database (8 TB limit for normal data with HSQLDB 2.3). Under such extreme conditions, full CHECKPOINT processing is no longer the best option. That's why HSQLDB 2.3 offers a "hot-backup" feature, designed to back up large databases in high-concurrency environments where performance would otherwise become unacceptable during lengthy CHECKPOINT processing. Incremental backup (effectively an incremental CHECKPOINT) features have also been added to accommodate these extreme environments.
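As a sketch, incremental mode is enabled with a single statement from the Tools > SQL console (standard HSQLDB 2.x syntax):

Code:
-- maintain the .backup file incrementally during normal operation,
-- so a CHECKPOINT no longer pauses to copy the entire .data file
SET FILES BACKUP INCREMENT TRUE;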
With proper use of the LOB data types (BLOB/CLOB, which HSQLDB stores in a separate .lobs file), I wouldn't imagine that we have very many Base users with .data files in the gigabyte range, much less terabytes. So with a modest database of under 100 MB (.data file size), there's little reason to pursue anything beyond CHECKPOINT on a running database, while employing daily or real-time backup software with history/version features such as we get for free with cloud-synced folders. In summary, I would rely on CHECKPOINT on running databases in less demanding environments, as it provides the best trade-off between down-time (mere seconds) and completeness.
Sliderule and I investigated the performance of SHUTDOWN / CHECKPOINT with HSQLDB a few years ago, with the following results:
DACM wrote:
Here are the results as reported by SQL Workbench/J:
Code:
SIZE (# of records)       SHUTDOWN        SHUTDOWN COMPACT
0.83 MB (  5,000 rows)    0.07 seconds     0.34 seconds
1.67 MB ( 10,000 rows)    0.10 seconds     0.56 seconds
4.13 MB ( 25,000 rows)    0.15 seconds     1.25 seconds
8.25 MB ( 50,000 rows)    0.27 seconds     2.34 seconds
16.5 MB (100,000 rows)    0.48 seconds     4.66 seconds
33.0 MB (200,000 rows)    0.95 seconds    11.0 seconds
66.0 MB (400,000 rows)    1.79 seconds    20.7 seconds
132 MB  (800,000 rows)    3.66 seconds    42.0 seconds
These tests were run on a laptop with a mid-range mobile CPU and a mechanical hard drive. For comparison, that CPU is one-third the speed of an i7 mobile CPU, and a fraction of the speed of the fastest consumer CPU today. The hard-drive subsystem is likewise a fraction of the speed of a modern solid-state or hybrid drive.
MTP wrote: So if I make the logsize smaller, it will automatically make checkpoints more often?
You can reduce it to 1 MB, but if relatively little data is written, even that threshold may not trigger a CHECKPOINT very often. I can't recommend relying on this file-size trigger, in general.
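For reference, the property in question is set like this (standard HSQLDB 2.x syntax; the value is in megabytes):

Code:
-- perform an automatic CHECKPOINT whenever the .log file exceeds 1 MB
SET FILES LOG SIZE 1;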
The long-term reliability of any database is predicated on the diligence of the database administration. That administration can be automated through scheduled scripts utilizing SQLTool to dispatch SQL commands, or by a Form macro (autorun or push button). In light of Murphy's Law, multiple layers of protection are prudent...
Prudent layers of protection for critical data (perhaps in order of importance):
Layer 1: A stable computing environment with sufficient RAM
Layer 2: Battery backup power (implies built-in surge protection and a degree of line-conditioning...simply use a laptop or a UPS)
Layer 3: A transactional RDBMS with ACID properties and sufficient backup and recovery automation (HSQLDB, H2, PostgreSQL, MySQL, Firebird, etc.)
Layer 4: Regular CHECKPOINT (or VACUUM) of the running database
4a. Issue this command at least daily, perhaps using a scheduled script or Form macro automation (autorun or push button)
4b. Specialized "hot-backups" and/or "incremental backup" measures may be necessary with very large databases (GB+) in high-concurrency environments where CHECKPOINT performance proves unacceptable
4c. HSQLDB 2.x does not maintain a .backup file by default, so if you don't see this file in your database folder, issue the following command using the Tools > SQL console in Base to initiate "full" backup upon CHECKPOINT or SHUTDOWN. Note the "... INCREMENT FALSE" (below), which enables "full" backup as opposed to "incremental" backup. With very large databases (gigabyte .data file size), "incremental" backups become necessary to minimize CHECKPOINT processing delays in demanding environments. Either backup mode is sufficient.
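Code:
-- maintain a full copy of the .data file (as .backup) upon each CHECKPOINT or SHUTDOWN
SET FILES BACKUP INCREMENT FALSE;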
Layer 5: Employ additional backup layers with file-history (versioning) features
5a. These features are provided in real time (automated instant backup) by most cloud-synced folders such as Dropbox, Google Drive, or similar. This type of off-site folder synchronization is super easy, but privacy issues emerge; client-side encryption such as provided by Boxcryptor, Wuala, and SpiderOak becomes highly desirable. Each time you issue a CHECKPOINT or SHUTDOWN command, the changed files are backed up immediately to your local machine and to the cloud, with access to previous versions of each file.
5b. Some backup software can provide similar file history to internal or external drives, including NAS or even FTP sites.
5c. HSQLDB 2.3.x offers "incremental" and "online" (hot) backup facilities. "Incremental" backup is designed to enhance CHECKPOINT or SHUTDOWN performance with very large databases (gigabyte .data file size), as mentioned previously; we can switch between "incremental" and "full" backup at any time. "Online" hot-backup adds the ability to initiate a full backup of a running database as a single compressed file (the default option), such as for archiving purposes. When combined with "incremental" backup, "online" hot-backup is an efficient way to back up a large database in a demanding (high-concurrency, multi-user) environment. The associated backup command can be run from a batch file using the provided SQLTool on a schedule (leveraging a script or the operating system), as sketched below. This built-in backup option can be a good choice, particularly when 5a/5b above (cloud folders or third-party backup software) are not an option. But keep in mind that options 5a and 5b maintain a backup of the entire split-database folder, including front-end components (Base .odb) plus any subfolders used for external image or document storage. And 5a (or even 5b) provides an off-site copy utilizing personal encryption as necessary. So when 5a/5b can be employed, they provide a more thorough solution, invoked simply by running CHECKPOINT or SHUTDOWN -- which can be automated by batch/script or a push-button macro on a Form.
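A minimal sketch of that statement (the 'archive/' directory is a placeholder; the trailing slash lets HSQLDB generate a time-stamped archive name):

Code:
-- back up the running database as a single compressed archive, e.g. mydb-20240101120000.tar.gz
-- BLOCKING pauses transactions only while the archive is written
BACKUP DATABASE TO 'archive/' BLOCKING;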
Layer 6: Periodic SHUTDOWN [COMPACT] of the RDBMS (perhaps weekly) and server hardware (perhaps monthly)
Layer 7: Employ RDBMS-based clustering
MTP wrote: I have contacted our contractor who does daily backups of our servers... Hopefully that will lead to recovery.
Let's hope so.