[Solved] Relationship design - one-to-one or giant table?

Creating tables and queries
Post Reply
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

[Solved] Relationship design - one-to-one or giant table?

Post by Doranwen »

I've been trying to design a database for a language/education project and for some reason am struggling to understand how to set it all up. (My fanfic database last summer was far more complicated but it made a lot more sense to me, lol. Go figure.)

The main table would have a varchar field called Word (it'll most likely be unique, but I'm told that using anything other than a numeric ID is a bad idea, so I figured I'd create a WordID field as the primary key with autoincrement). There are a couple additional fields which do not require data and are unique to it - an integer field, a varchar field, and a couple Boolean fields. The rest of the fields in the entire database are all Boolean ones - *250+* of them - and they all relate to the Word field directly. Technically, it's simple enough to do it all in a spreadsheet, relationship-wise, but a spreadsheet couldn't possibly display them all in any easy-to-enter manner and a query with a bunch of =0 for a large number of the Boolean fields would be far simpler than doing autofilter a hundred times, especially once I've done one or two of them and can re-use much of the data.

At first I thought of creating one enormous table for all of it - but I discovered a) a gui limit to table fields (meaning I would have to enter the rest via SQL), and b) read various things stating that one enormous table might be less efficient than a set of related smaller ones.

There are four logical subsets for the 250+ Boolean fields, so I can break them all up into that. Then the question becomes: How do I relate them to the original table? It would be a one-to-one relationship, NOT one-to-many at all. Every entry in Word would have 0 by default for every one of the Boolean fields, and 1 only if I check the box for it.

At this point I'm not sure whether I should be trying to stick them all in one table anyway (and set up a long SQL command or set of commands to insert field after field), or whether I should be setting up five tables - the main one and the four subset tables, and set up relationships somehow (and how?). Thoughts?
Last edited by Doranwen on Mon Nov 09, 2020 1:30 am, edited 1 time in total.
LibreOffice 5.1.6.2 on Linux Mint 18.1
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

Very wide tables indicate a many-to-many relation.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

For each record with the Word field the many checkboxes are a matter of identifying "yes" or "no" if they have a particular element in them. 250+ times. What's the best way to handle that? I cannot possibly create 250+ tables, one for each element - or worse, 500+ tables, with a junction table for each element to go between! Besides, as they are Boolean and not any other field, there's only two options - yes/no - rather than a variety of possible options and relationships.

The fact that a spreadsheet *could* handle this, and could handle it somewhat easily, if it weren't a royal pain to scroll and apply autofilter on, indicates that many-to-many is perhaps not the setup I want to use… but what IS, is the question.
LibreOffice 5.1.6.2 on Linux Mint 18.1
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

It is a many-to-many relation. A spreadsheet can not handle this properly. A database that is designed like a spreadsheet can't handle this neither.
In the database [Example] Relations reflected by list boxes in forms there is a one-to-many relation between animals and persons where each person has one species of animals. And there is a many-to-many relation between persons and things where each thing can belong to many persons and vice versa.
You have a list of persons, a list of things and a linking table storing the information which persons and which things are associated to each other. Base provides one method to edit the data of a many-to-many relation where a subform grid shows all the related items of the selected item in the parent form. The grid is made of list boxes showing textual information (names) related to the stoerd ID numbers.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

I could design a spreadsheet right now with all those 250 fields as columns, and each cell I could enter either 0 or 1 for it. A spreadsheet would handle it quite fine! It would just be a bear to scroll so far right to enter a 1 in all the columns I want to. It would also be a pain to autofilter *out* 100 of the columns. (A database does queries much better, and, as I said, I could reuse query language to make it more efficient.) But it's not like a spreadsheet couldn't do it. It would just not be anywhere as handy as a database.

I looked at the list box in your example. A list box is not going to be anywhere near easy and efficient enough for me to mark the elements. Scrolling in a dropdown is FAR more inefficient than simply checking a box in the correct places, given the total number of fields in question. There are some smaller subgroups I could break the elements up into, but there are so MANY of them that it would be impossible to present them all in a form where I could select them - there's just not enough window space for it. This is a situation where a word either has or doesn't have any of 250+ elements in it! The sheer number of elements makes a listbox an impossibility. Each of the four main subgroups has anywhere from 13 (the smallest, and reasonable enough for a dropdown) to *121* different Boolean fields in it. Scrolling down to find a specific thing in a list of 121? Please no, lol.

So I'm still stuck at - how can I set this up to be reasonably efficient - both in the data storage AND in the user entry. Checkboxes are a MUST; there's no question about that. It's how to set up the table relationships and keep them reasonable so that a single form can be used to enter everything that is trickier. The only way I've ever learned to set up many-to-many relationships involves a junction table, and I don't see how one can use checkboxes with a junction table. I've tried to think through how to set up a table with this and it still doesn't make any sense to me, no matter how I look at it. :/
LibreOffice 5.1.6.2 on Linux Mint 18.1
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

You need some database development suite and invest a lot of time if you want to represent attributes of a many-to-many relation by means of check boxes. LibreOffice Base can't do that unless you write macro routines for every click on such check box.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

Since that's not feasible (I'm barely learning how to use Base and macro writing is beyond me), how terrible is it to put all the checkboxes into a few linked tables without using a junction table, given that it's Boolean - everything's either 1 or 0? (Null isn't an option here.) I've had Boolean fields in my other database and no one at the time suggested that the existence of a Boolean field meant that it was a many-to-many relationship, even though clearly multiple records would have a 1 or a 0 in them. Maybe it's not the most efficient way of doing it. But it's certainly not impossible, and given that lots of other databases have one or two Boolean fields in there without anyone suggesting they need to be split into a junction table of some kind for a many-to-many relationship, this has to be an option as well.

I can easily set the four subgroups up as separate tables (assuming I'm using the primary key from the Words table to link them as if it's a one-to-many relationship? definitely am trying to figure that one out right now). I assume that since everything must either have this element or not, that I want the default to always be No, not null. The idea is, for every entry into Word, all of the other fields would automatically get a 0 in them, until I check boxes to mark some as 1. The real question is, how do I set things up so that every entry automatically gets populated with those 0s for each field, just by entering something into the Words table's fields?
LibreOffice 5.1.6.2 on Linux Mint 18.1
UnklDonald418
Volunteer
Posts: 1544
Joined: Wed Jun 24, 2015 12:56 am
Location: Colorado, USA

Re: Relationship design - one-to-one or giant table?

Post by UnklDonald418 »

Look at this example as a possible solution to your problem, without all those Boolean fields
Demo53_ListBoxes.odb
(22.25 KiB) Downloaded 351 times
If your problem has been solved, please edit this topic's initial post and add "[Solved]" to the beginning of the subject line
Apache OpenOffice 4.1.14 & LibreOffice 7.6.2.1 (x86_64) - Windows 10 Professional- Windows 11
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

Unfortunately, anything *other* than checkboxes would make the data entry part far less efficient - because there are so *many* of them that dropdowns would take MUCH longer to enter the data than just checking boxes. (It's simply impossible to generate enough dropdowns in logical groupings that I wouldn't have to scroll, with the screen space. There are at *least* 50 or 60 separate mini-groupings, if you go beyond the four main groups. There aren't any in between groupings to that.) And I only want to see one record per form - one word, and all the checkboxes marking which elements match. I'm not much likely to be browsing them after the fact - the point of this is so I can set up queries that will pull lists that match my criteria, but there will be a ton of data entry to get to that point, and anything other than checkboxes would add tons of time and frustration. I appreciate your efforts to help, but it worked out better for me to go with the following, even if it's not best practice (in this case, best practice would have been either terrible data entry options, or extraordinary amounts of work at a skill level far above my current one - simply not feasible):

I ended up turning it into a giant table after all, lol - used SQL to add all the columns in one batch and removed the other tables I was testing out. (You have to do each ALTER TABLE line first for every ADD COLUMN since Base doesn't let you enter multiple ADD COLUMN lines with a single ALTER TABLE as some would, but I was able to use a Linux command to generate the ADD COLUMN part - including DEFAULT '0' - with a txt file of the column names only, and then a copy and a ton of pasting to set it up for all of them, but then I was able to drop it all in at one shot.) And it works like I needed it to. Clumsy, perhaps, but I really don't see another way to preserve the checkboxes - which are absolutely essential when it comes to the data entry part, as that's going to take me a ton of time - especially since I could not for the life of me figure out how to use the primary key in my Words table to populate any other tables with entries, even with default 0 for all of them.

Now all I have to do is figure out how to dump in the first few thousand words and one associated numeric field to go with so I can start marking the elements, and split the database so it's not embedded anymore (possibly not in that order).
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

You know that you can type into a drop down in order to find entries a lot faster than scanning by eyeball? There is also a tall representation of a list box where you see neighbouring entries while typing.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

The attached demo database is based on a list of persons. Persons can be picked from a listbox by typing the family name into a listbox. The listbox entries are defined in query "qPersonsListbox":

Code: Select all

SELECT "SN" || ',' || "FN" || ',' || "BD" AS "Visible", "ID" FROM "Persons" ORDER BY "Visible" ASC
which gives visible listbox entries like this

Code: Select all

Obreien,Jolie,1935-03-04
and an invisible bound column "ID" entering the person's ID number into some data field.
Input form "Listboxes" shows a tall list box for a single entry and a table control with multiple listboxes. Only the table control with listboxes is suitable for many-to-many relations.
I used such listboxes to look up one of thousands of different persons. In case of duplicate family names you can type a comma and continue typing with the forename. In case of duplicate full names, type another comma and the year of birth, a dash and the month of birth, a dash and the day of birth. Just type in data until you found the person you are looking for.
Attachments
DummyPersons5.odb
Listbox demo
(78.21 KiB) Downloaded 343 times
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

Typing in a listbox means I have to remember how I formatted the field or something - it's still WAY slower than the checkboxes, lol. Most of these elements are tiny names but they're not always as easy to remember the name (nor would they be that easy to remember any other version of formatting). The checkboxes on the form, on the other hand, are ordered in a logical format with those little groupings that makes it easy to find. By the time I remembered what to type, my eyes would already be in the right place for the checkbox and I'd have clicked it. Half the value of that method is *seeing* all the options right there so I can choose the correct ones.

Either way, no one's explained just *how* to divide up 250+ Boolean fields into tables some other method. Are you trying to suggest that all of the fields be in some other table together? That they should be something other than Boolean? While an example may be nice, I'm not intuiting from looking at the example databases what you're trying to say (other than "use a listbox"), and I probably won't at this point. I'd need a clear verbal explanation of what fields I'd be replacing them with, what tables they'd be divided into, and what relationship those tables would have. But it's a moot point - since anything else would still be forcing me to try to use a listbox for a data entry method where a listbox will slow me down greatly.

What is using a giant table of Boolean fields going to do to the database, especially if I never go beyond 10k records or even there? (It's unlikely I would.) Slow it way down?
LibreOffice 5.1.6.2 on Linux Mint 18.1
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

Typing in a listbox means I have to remember how I formatted the field or something
You should remember the first characters of the attribute name. The attribute name is stored as a value. The user is not confronted with field names. And the full list of attribute names can be displayed on the form as plain text or in a tall list box for easy lookup.
it's still WAY slower than the checkboxes,
If you work with the form on a daily basis, nothing is faster than lookup by typing.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
RoryOF
Moderator
Posts: 34586
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Relationship design - one-to-one or giant table?

Post by RoryOF »

It might not be out of place to suggest that such lookups be done on a case insensitive basis, no matter what output formatting is defined for the text.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
User avatar
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Relationship design - one-to-one or giant table?

Post by Villeroy »

Case sensitivity can be adjusted on a per-column basis. There is a special data type for case-insensitive text. However, what you type into a listbox is treated case-insensitvely anyway.

The checkbox approach is not maiintainable. Whenever a new attribute comes up you have to change the designs of the table, all queries where attributes are involved, input forms and reports. Attribute names in a table are added within seconds and they are usable after reloading the form(s).
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

Villeroy wrote:The checkbox approach is not maiintainable. Whenever a new attribute comes up you have to change the designs of the table, all queries where attributes are involved, input forms and reports. Attribute names in a table are added within seconds and they are usable after reloading the form(s).
The main sets of elements - the 254 ones that will be involved in nearly all queries - are already set and won't be changed. I've done enough extensive work with them in the forms of text file lists (realizing that if I switched the order of things, I'd have to switch the lists all over the place, is what caused me to go "I need a database for this instead") in progressive orders that there *can't* be any further additions - they're finite. The only thing I might possibly add are Boolean fields for general categorization in a couple specific groups, and they wouldn't affect any existing queries (and would cause only the most minor of changes to a form). And I already thought of the two I can come up with and am not likely to add many further ones.

I'm still trying to understand your listbox suggestion, since it's never quite been explained *HOW* the tables are to be set up. Are you saying you think I should create one table for the words, one table for each group of elements (and the smallest sets I could go down to would be the 64, 122, 55, and 13, because anything smaller would be a lot of 1s and 2s among the 6s and 7s, and that would be a truly ridiculous number of tables and listboxes), and a junction table between each table of elements and the words table? And somehow set the listboxes to show the element descriptions instead of the field names?

Also, as far as queries go, I would be doing a lot of *exclusions*. Like, excluding 100+ elements at a time. Very very little would be "everything that DOES match element X". With Boolean fields, it's easy to say "elementA=0" or "elementB=0" a whole bunch. But from my setting up of queries for the last database I created, excluding stuff from a query involving junction tables got VERY very complex. Complex enough that I couldn't create it myself and had to have someone provide me the SQL. It also used a ton more SQL to exclude a single thing than to match on AND stuff. So it would end up being a lot more work to create those types of queries, unless you know a simple way to say "exclude these 100 elements". As far as I know there's no way to say "pull only things which include one or more of this pool of elements and *nothing* outside them".
LibreOffice 5.1.6.2 on Linux Mint 18.1
chrisb
Posts: 298
Joined: Mon Jun 07, 2010 4:16 pm

Re: Relationship design - one-to-one or giant table?

Post by chrisb »

hello Doranwen,

in january 2020 a user opened a thread with similar objectives: viewtopic.php?f=39&t=100832
briefly the db contained many text strings which described a number of tasks which needed to to be carried out at various times of year.
the data i assumed would be presented as text, a checkbox would signify completion & the data would be digitally logged.
that db was far more complex than yours but posed an identical issue, how do we present & store such data?
i found conventional methods to be slow & clumsy, the solution i settled on was to pre-generate the data tables using macros, everything was handled by macro & it worked really well.
i never made that db available because the OP was very slow to respond & i felt unappreciative.

attached is a small db, it differs from the above in that the above would receive input for every field whereas yours will not. it is therefore inefficient & under normal circumstances i would not recommend it's use but if ease of input is the goal & you want checkboxes then it's worth consideration.

the elements are split into four categories which are named for simplicity of coding (do not rename the tables).
three small macros are used to insert data into the the log tables so that we can use check boxes.

this setup is inflexible, once you generate the logs for a word then a record is created for every record held in the section (boolean text) tables & it's not possible to further add or delete elements, therefore it's wise to ensure that all possible elements which relate to a particular word are entered into the section tables before the logs are generated.

if a word is deleted then all related logs are also deleted.
please don't think that i expect you see this as a solution, your issue may appear trivial but that really is not the case.
RelationshipDesign.odb
(26.3 KiB) Downloaded 332 times
open office 4.1.14 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.6.0 (Split) on Windows 10
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

I appreciate the offer! I'm not sure this will work all that well, as it's *possible* that I might forget to enter something and want to come back to it, so that limitation could be an issue. I'm used to being able to go edit records (though in the last one that I created, some of the junction table records I don't have an easy way to delete, so I might have to figure out an alternate form just for deleting stuff so I don't have to scroll tables for it once it gets any bigger than it is now).

Even more than the checkboxes (which I could get over, as much as I do prefer them) is the concern over queries. When I look at it, the real problem is that even a database, even the way it's being suggested with listboxes, isn't really set up to request data in the way I'm needing it. I can see how to set up long - but ultimately not too intricate - queries to pull the data necessary, but I have to use a whole lot of "where this element = 0" (in proper SQL, of course), which is probably inefficient but will do the trick, if I think through it all. In reality, assuming a pool of these 250-some elements, if I want to pull just records matching 4 elements, what I actually want is NOT "everything with any of the 4 elements in it", I actually want "anything that has only some or all of these 4 elements in it, and nothing else". A word with elements A and B but not C or D *should* be in the results, but a word with all four elements *and* E as well should *not* be in the results. I fear that changing the structure to multiple junction tables will make creating SQL queries to pull the exact combinations that I want (particularly if I'm excluding well over 200 elements) an absolute *nightmare*! And I would need to create at least several hundred unique ones at some point…

EDIT: Looking at the queries I had to come up with here, I'm less concerned about the queries: viewtopic.php?f=61&t=98825

Apparently excluding multiple IDs is even simpler? The one user suggested that it's possible to just list the numbers of their IDs separated by commas in the one spot? Given that, I do think that the listbox method may prove easier to work with, then, despite some frustration in how I'll have to manage entry and set up a form. So let's see if I can figure out a table setup that works for that…
LibreOffice 5.1.6.2 on Linux Mint 18.1
UnklDonald418
Volunteer
Posts: 1544
Joined: Wed Jun 24, 2015 12:56 am
Location: Colorado, USA

Re: Relationship design - one-to-one or giant table?

Post by UnklDonald418 »

One thought on handling sub-groups would be to add a SubGroup field to the 4 group tables.

A filter table might help with data entry on your form(s)
[Example #3] Filter/Search with Forms (leveraging SubForms)
If your problem has been solved, please edit this topic's initial post and add "[Solved]" to the beginning of the subject line
Apache OpenOffice 4.1.14 & LibreOffice 7.6.2.1 (x86_64) - Windows 10 Professional- Windows 11
Doranwen
Posts: 83
Joined: Sun May 26, 2019 6:46 am

Re: Relationship design - one-to-one or giant table?

Post by Doranwen »

I tried to read that but my eyes glazed over, lol.

A subgroup field would help somewhat, maybe, but with the one table I'm looking at which has 64 elements… it's 25 subgroups, eight of which have only one element in them, and only two are as high as 6 elements. So I'm not sure how useful they would be, since the attribute name would automatically begin with the subgroup name allowing for fairly fast data entry, relatively speaking. And the 13 element table has almost no useful subgroups at all - nearly everything on there is in 1s. The 122 element table *might* have some use - all of those are at *least* 2s, up to subgroups of 8, but I was not following the process whatsoever, unfortunately, and if I'm consistent in the naming scheme, I should be able to come up with something I can type to filter down, even if it's not as quick for me as checkboxes. The queries are more of a concern now, as queries based on filtering 200+ Boolean elements vs. excluding 200+ ID numbers in a comma-separated list makes the latter far simpler and well-worth any annoyances in adjusting to the initial data entry, now that I've looked at what those entail.

So as far as how the general design of the structure goes, I *think* I have that figured out and can mark this solved. (I'll have to look at the sample db with the listboxes way above to see if I can figure out how to set up the listboxes properly, but that's a Form question and not a Tables & Queries one.)
LibreOffice 5.1.6.2 on Linux Mint 18.1
Post Reply