Page 1 of 1

[Solved] Phantom records in data tables

PostPosted: Wed May 08, 2019 8:16 pm
by schadecp
I am having an odd issue with a complicated database running on Windows 7 with version 4.1.6 of open office.

I noticed some extra rows on a data entry form that replicated previous rows in the form. Because the control was a grid that is linked to a table with a unique primary key, there should never be duplicates. The rows always appeared at the end of the list.

I figured it was something with the logic, so I examined the table. It had 40 actual records, with primary key values 1 through 40. But the record counter said there were 58 records, and there were 18 extra records appended to the end which were copies of records 1-18. I queried using a simple "Select * from TABLE" and got the same 58 records. Interestingly, the 18 at the end could not be selected and the row pointer would not go to any of them. I queried using a count function, e.g. "Select count(X) from Table" where X exists on every record, and I got the correct answer of 40.

I thought the table might be corrupt, so I created a copy and inserted records from the original table. Same issue. New table had 18 phantom records. I wrote macro code to regenerate the table by hand and copy every field in every record, figuring that copying the table definition might have copied whatever was corrupt. Nope! Still 58 records.

I took the SQL that copied and regenerated the table and dropped one line from it. 39 records in the table with no phantoms. I went to an old version of the database and checked the comparable table. When I opened the table in question, the record counter reported 58 records. Interesting coincidence. I moved to the end of the table. It has 161 lines with no phantom records.

I deleted all but 40 records from the table and re-opened it. 58 records with 18 phantom records.

I generated the table using SQL in a new empty database. The table was not visible on the list of tables (had to close and reopen the database to see it). However, I could run a query against it. A simple select returned 40 records. Closed and reopened the database. Table now visible with 58 records. The query that had returned 40 records now returned 58.

I've attached the simple database and the generating SQL. The records are actual records from a production database, but I have replaced all text fields with random strings of the same length.

Is this a harmless but annoying bug or something that is going to bite me in some unpredictable way in the future? And what's with the number 58?

Re: Phantom records in data tables: something magical about

PostPosted: Wed May 08, 2019 9:23 pm
by Villeroy
After running an SQL command from within Base or from anywhere else, menu:View>RefreshTables forces Base to re-read the database.

A database table can have millions of records without any problems. The Base frontend loads only as many as needed to be displayed on screen. The rest is loaded as you browse through the record set. The record indicator at the bottom of a row set may display something like:
Record 13 of 42 *
where the * indicates that 42 is the amount of rows currently loaded. When you navigate to the last record, the full amount of rows is dispayed without the asterisk:
Record 978123 of 978123

Re: Phantom records in data tables: something magical about

PostPosted: Wed May 08, 2019 9:57 pm
by UnklDonald418
When you see something odd like too few records or when scrolling you see a bunch of duplicate rows the problem can usually be resolved by jumping to the last record. Sometimes you may also need to jump back to the first record.

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 3:00 am
by schadecp
Right. Villeroy, there are only 40 records in the table, yet when I open it, the record count is 58, with an asterisk until I scroll to the end. Then it stays 58, but the asterisk goes away.

UnklDonald, Thanks, but I knew about the duplicated rows on scrolling too fast and the fixup (reload the screen). This isn't that. Reloading the screen does not fix it. And the rows aren't a single row copied multiple times, which is what I see when that kind of duplication occurs, but in my example database, the first 18 records replicated in sequence. This may be related to the other problem, but it is distinct.

It is a major annoyance, but so far I have not determined any failure of queries, so the database appears to be intact. I suspect a user interface issue. Has anybody looked at the database I uploaded to confirm the problem in another environment? I'd love it if the problem were a corrupted copy of OpenOffice, but I doubt that is the cause.


Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 3:07 am
by systemdocean
I checked your uploaded database on LibreOffice Base ( and I only see the 40 records in the table view, and the select only loads the 40 rows. I do not see any more than 40, nor is the GUI indicating there is more than 40.

Maybe try using LO instead of OO and see what you get.

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 4:05 am
by schadecp
Thanks. It's good to know it works in LO (with which I am completely unfamiliar). Whether I can switch my project to LO depends on compatibility, as I have had to develop a bunch of macros to achieve the functionality I need. The website says "Forked in 2010" which is a long time ago.


Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 11:32 am
by Villeroy
Since beginning of this millenium the one and only purpose of was to introduce an open document format which is complex enough to depict all features of modern office suites while being simple enough to be implemented by as many products as possible. Back then, being fully compatible with the market leader's binary formats turned out to be illusional and dangerous. The ODF format turned out to be very conservative and even the market leader supports parts of it by now. First they had to fight the standard by introducing their own pseudo-standard with "Open" and "Office" and "XML" in the name.

LO may have a lot more featues but in the end LO produces the same document format. The database component is almost the same. The embedded HSQL version is identical. They work on a replacement driven by embedded Firebird databases.

Some improved Calc functions have AOO compatible variants. Most changes affect the user interface(s). 99% of the API remained the same. The Python2 macro language changed to Python3.

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 3:50 pm
by schadecp
Thanks again. I will definitely look at LO. I'm going to leave this stream "Not Solved" in spite of those very helpful suggestions, as migration is out of the question in the short term. Your comments make me think it likely that migration would not be too difficult. My macros are in OOBasic, which is mostly, but not entirely, compatible with vBasic. They access a bunch of objects, whose properties and methods would have to be pretty much the same to avoid a huge reprogramming effort. The documentation of the AOO object model leaves a lot of unanswered questions that I solved through trial and error. Don't know about LO's documentation.

While all my experiments suggest the phantom records are harmless but annoying, I am hoping that someone who knows how Open Office Base is put together can tell me that definitively. I'm also *curious* about the 40 record threshold and the magic number 58, but that is nothing I have to know to keep my database in production.


Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 6:07 pm
by Villeroy
You must not write any macros without object inspection: [Tutorial] Introduction into object inspection with MRI. You better avoid embedded databases for productive purposes. Embedded HSQL is good enough for learning SQL and for demos but there is a risk of losing all data and the risk increases with the size of the database. It is easy to turn the embedded HSQL into an external one: [Python] Macro to extract and reconnect embedded HSQLDB

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 6:51 pm
by UnklDonald418
I did some further testing on a Windows 10 platform and I found that a Base table with exactly 40 records caused problems for the OO table GUI in Base, no problems if there were 39 or 41 records. For me, when the table GUI loads a table with 40 records it reports Record 1 of 74*. If my screen size is too small to display all 40 records and I jump to the last row, everything works properly, however if I scroll down, after it reaches record 40 it restarts with "FormID" 1 and continues repeating rows to "FormID" 34 (record 74). After first opening the table in the table GUI if I expand the window size it immediately changes to Record 1 of 40.
I thought that using a CACHED TABLE might have something to do with the problem, but that didn't turn out to be the case. I also tested it using a JDBC connection instead of an Embedded database but that too made no difference.
I also confirmed that no problem occurred when I opened your database with LO.

It isn't clear if you are aware that an Embedded Base database is not appropriate for any database anticipating such a large data set that would require a cached table.

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 7:25 pm
by RoryOF
I'm not a database user, but I have a (very vague) memory that a default screen for Base display handles 40 entries. If this is so, there may be a boundary test error in the code, and it ought be logged as a bug. In many cases a boundary test error (anywhere in computing) is merely an incorrect test, using - for example "=" rather that ">=" or similar.

Re: Phantom records in data tables: something magical about

PostPosted: Thu May 09, 2019 8:01 pm
by schadecp
Thank you, gentlemen. You have solved the problem. I will report it as a potential bug. I confirmed the window size change workaround.

UnklDonald: thanks for the extra advice about cached tables. The SQL I sent is exactly the same as AOO uses to create the table, so AOO apparently thinks the table will be large. Would you advise manually recreating the table without the CACHED option? I could copy the data easily enough.

Villeroy: thanks for the note about embedded databases. That's done for user convenience. I don't want to mess with installing an external database on a Mac OS X system by a user 500 miles away, especially since I don't speak Mac. A user wiped out her initial data entry by not remembering that the database can crash, and is now backing up very frequently, sometimes hourly. I used this database in production successfully for a year, so I have little doubt it's feasible with the embedded database, if inconvenient at times. If I'd had my 'druthers, I'd have done it in MSAccess, which is pretty stable. But there's no version for Mac.

And thanks for the link to MRI, which I will install hasta pronto. My "object inspection" is somewhat more primitive than MRI offers.


Re: [Solved] Phantom records in data tables

PostPosted: Thu May 09, 2019 9:46 pm
by UnklDonald418
Would you advise manually recreating the table without the CACHED option? I could copy the data easily enough.

From the HSQL 1.8 user guide which is the version used with an Embedded database
"CACHED tables are created with the CREATE CACHED TABLE command. Only part of their data or indexes is
held in memory, allowing large tables that would otherwise take up to several hundred megabytes of memory. Another
advantage of cached tables is that the database engine takes less time to start up when a cached table is used for large
amounts of data. The disadvantage of cached tables is a reduction in speed. Do not use cached tables if your data
set is relatively small. In an application with some small tables and some large ones, it is better to use the default,
MEMORY mode for the small tables."

To get an idea of the scale they are talking about for cached tables
"hsqldb.cache_scale - 14-memory cache exponent
Indicates the maximum number of rows of cached tables that are held in memory, calculated as 3
*(2**value) (three multiplied by (two to the power value)). The default results in up to 3*16384 rows
from all cached tables being held in memory at any time."

Re: [Solved] Phantom records in data tables

PostPosted: Mon May 13, 2019 1:10 pm
by Nocton
Late to the discussion, but when I open OneTableTest.odb in OO I see 74 records - first a block with ID 1 to 40 followed by ID 1 to 34, duplicates of the first 1 to 34.
I then added a new record, this produced just 41 records but an ID of 42. Adding another record produced 42 records and an ID of 44. so after moving off 40 records the display shows correctly, but the auto increment is 2 not 1 !!

Very peculiar, but it may add something to the conclusion/solution.

Re: [Solved] Phantom records in data tables

PostPosted: Tue May 14, 2019 3:14 am
by schadecp
Interesting. I confirmed that in my database. I've already reported this as a bug and referenced this stream. I don't think the phenomenon you identified actually corrupts the data, but it sure could confuse and frustrate a user. When I enter a new record after record "58" the counter tells me there are now 60 and the new record disappears. When I hit "sort" the counter goes to 41. The new record I entered reappears, but its autonumber is now 42. Its content is correct, however, so the only thing in the data that the bug hits is the autonumbering algorithm. The take-home for me is not to enter records from a table grid when the record count is exactly 40. Which is annoying.

But actually what I think is happening is the way HSQL handles autonumbering, and not another issue with the database. As I recall, I had already attempted to add a record to the database I published. So the counter thought the next record should be 42. To confirm it, I deleted the new record and ran the following SQL:


This resets the record counter to the correct value, as if I had not inserted and deleted a couple of records. Sure enough, when I added a record, this time I ended with a record number of 41. So I conclude it's not screwing up the data table, but is in all likelihood a bug with the GUI.