Building a query for fragment identification

Creating tables and queries
Post Reply
Mordtret
Posts: 5
Joined: Sun Jan 21, 2018 9:23 am

Building a query for fragment identification

Post by Mordtret »

Hej,

I am pretty new to using databanks my question might have therefore a pretty simple solution that I am not able to see or next to impossible.

I have a moderatly big archaeological dataset and need to see how much of each I have. The problem is that the material is fragmented.

I recorded which parts are there: For example for a doll (I actually don´t have dolls but it is probably esier to understand):
1st) head: 100% preserved; body 50%; left arm 25%; right arm: 0%; legs:0%
2nd) head: 0% preserved; body 50%; left arm 25%; right arm: 50%; legs:100%
3rd) head: 25%% preserved; body 25%; left arm 0%; right arm: 25%; legs: 0%

I would therefore have at fragments of at least 2 dolls since the 1st and 2nd or the 2nd and 3rd fragment could be from the same doll but the the 1st and 3rd have parts of the head overlapping so must be from different dolls. This is easy to do with a small set but with several hundred fragments it can not be done manually.

Is there a way to build a query that can solve this problem for me? Or does anybody have any other ideas?

Best Björn
Open office 4.1.3 on Win 10;
eremmel
Posts: 1080
Joined: Tue Dec 30, 2008 1:15 am

Re: Building a query for fragment identification

Post by eremmel »

1. It highly depends on your data model, so please share this with us.
2. I guess that you do not only want to combine two records, but also three, four etc. This is asking for recursive SQL, but that is not possible with the embedded database that comes with Base. But if you are happy with a simple approach to test up to e.g. four levels it can be done.
3. You need to self-join the table on same species and different records then you need to filter on the sum of each part-group (e.g. head) and reject when the percentage is higher than 100%.
Based on your data mode an outline can be given.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W11 22H2 (build 22621), LO 7.4.2.3(x64)
Mordtret
Posts: 5
Joined: Sun Jan 21, 2018 9:23 am

Re: Building a query for fragment identification

Post by Mordtret »

Hej thank you for your response.

to 1) I actually recorded animal bone fragments and used a open calc tabel for it which I converted to a csv file and fed it to the database. while I recorded the fragments. Each has an individual ID, three classes (Animal Kind, bone kind and left or right in the body) and up to 23 different portions that are recorded as 0; 25; 50; 75 or 100 (the % is in the headline of the colum), plus lots of other data about the bones that is not relevant for this query.

to 2nd) I am not entierly certain that I understand that, how will it be decided which IDs are compared when he only compares up to e.g. 4 each?
Open office 4.1.3 on Win 10;
eremmel
Posts: 1080
Joined: Tue Dec 30, 2008 1:15 am

Re: Building a query for fragment identification

Post by eremmel »

1)
So your data model look like:
ID, AnimalKind, BoneKind, HandSide, Portion
or
ID, AnimalKind, BoneKind, HandSide, Portion1, Portion2, ... , Portion23

2) When you use a database that does not support recursive SQL, you need to emulate that by hand up a certain degree. For each level you will get all combinations, but you might filter out some combinations like 'level1. ID = level2.ID


Which database are you using? From all open source databases PostgreSql has the most powerful SQL syntax (I might trap now a total different discussion ...). I'm not sure if only recursive SQL is sufficient to solve your query, we might need Windowing SQL functions as well and readability will be better with CTE. Most can be implemented with regular SQL as well, but with more complexity.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W11 22H2 (build 22621), LO 7.4.2.3(x64)
Mordtret
Posts: 5
Joined: Sun Jan 21, 2018 9:23 am

Re: Building a query for fragment identification

Post by Mordtret »

It is the 2nd option with portion 1; Portion 2.... just with blanks when there is no portion 12 e.g..

I have used access 2016 so far and just tried base now. As far as google told me access has an sql implimentation, would that be sufficiant or shoudl I look at PostgreSql?
Open office 4.1.3 on Win 10;
eremmel
Posts: 1080
Joined: Tue Dec 30, 2008 1:15 am

Re: Building a query for fragment identification

Post by eremmel »

The SQL engine of Microsoft (JET-engine) is used for MS Access. Is just a blend implementation of SQL. Use only the embedded database of Base (HSQLDB version 1.8) for testing and make after each important change a backup of your Base document.
If you are not afraid for some database maintenance, I think you will be happy with a move to Postgres. You can then use the tool 'PGadmin' as an administration front end.

But for the time being we might start with Base and access.

What do you think, over how many groups can an AnimalKind be distributed? Also consider what to do when you find a mergeable match between to IDs. Will you merge them in your collection and update the one record and remove the other?

I'm thinking about some improvements, because it will be quite expensive queries to collect all those results. When you e.g. consider one AnimalKind and apply the possible unions in your collection, it might be interesting to register that no other unions exists among the other registrations of that AnimalKind. You can mark that with the highest recorded ID of your database per registration. So the next time you add new registrations you have much less to map.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W11 22H2 (build 22621), LO 7.4.2.3(x64)
Mordtret
Posts: 5
Joined: Sun Jan 21, 2018 9:23 am

Re: Building a query for fragment identification

Post by Mordtret »

I am not entierly certain that I understand your question correctly.

I do not want to merge the the data of the fragments I just need to calculate the minimal number of different bones that where in the collection. This number is used to see how many animals where at least there.
What do you think, over how many groups can an AnimalKind be distributed?
:
An animal kind have up to 1200 fragments in them, a single animal still up to several hundred, but one bone, which is what I try to calculate, probably not more than 10. (I might have used the klass term incorrectly)
I'm thinking about some improvements, because it will be quite expensive queries to collect all those results. When you e.g. consider one AnimalKind and apply the possible unions in your collection, it might be interesting to register that no other unions exists among the other registrations of that AnimalKind. You can mark that with the highest recorded ID of your database per registration. So the next time you add new registrations you have much less to map.
I might not get where you aming here but I am finished with the recording of this collection so I do not need to bring in new data for it. It would still be nice to get a proper data bank structure for other coming projects though.
Open office 4.1.3 on Win 10;
eremmel
Posts: 1080
Joined: Tue Dec 30, 2008 1:15 am

Re: Building a query for fragment identification

Post by eremmel »

Please find attached a layout of the query you have to make to collect sets of 2,3,4 that make up sub-sets of one animal. You need to extend the query and change the table and columns naming.

Code: Select all

-- Note each query needs to have the same amount of columns, so
-- when you go up to 5 levels you need to give each query ID1 to ID5.
-- Combinations of two sets.
SELECT L1.ID AS ID1, L2.ID AS ID2, -1 AS ID3, -1 AS ID4, ...
	, L1.ANIMAL, L1.HANDSIDE
	, L1.PORTION1 + L2.PORTION1 AS P1
	, L1.PORTION2 + L2.PORTION2 AS P2
	, .....
FROM BONES AS L1
	INNER JOIN BONES L2
		ON L1.ANIMAL = L2.ANIMAL
			AND L1.HANDSIDE = L2.HANDSIDE
			-- only join ID 3 with ID 5 and not 5 with 3.
			AND L1.ID < L2.ID
WHERE
	L1.PORTION1 + L2.PORTION1 <= 100
	AND L1.PORTION2 + L2.PORTION2 <= 100
	AND L1.PORTION3 + L2.PORTION3 <= 100
	....

UNION ALL

-- Combinations of three sets.
SELECT L1.ID, L2.ID, L3.ID, -1, ...
	, L1.ANIMAL, L1.HANDSIDE
	, L1.PORTION1 + L2.PORTION1 + L3.PORTION1 AS P1
	, L1.PORTION2 + L2.PORTION2 + L3.PORTION1 AS P2
	, .....
FROM BONES AS L1
	INNER JOIN BONES L2
		ON L1.ANIMAL = L2.ANIMAL
			AND L1.HANDSIDE = L2.HANDSIDE
			-- only join ID 3 with ID 5 and not other permutations.
			AND L1.ID < L2.ID
	INNER JOIN BONES L3
		ON L2.ANIMAL = L3.ANIMAL
			AND L2.HANDSIDE = L3.HANDSIDE
			-- only join ID 3 ,5, 20 and not other permutations.
			AND L2.ID < L3.ID
WHERE
	L1.PORTION1 + L2.PORTION1 + L3.PORTION1 <= 100
	AND L1.PORTION2 + L2.PORTION2 + L3.PORTION2 <= 100
	AND L1.PORTION3 + L2.PORTION3 + L3.PORTION3 <= 100
	....

UNION ALL

-- Combinations of four sets.
SELECT L1.ID, L2.ID, L3.ID, L4.ID, ...
	, L1.ANIMAL, L1.HANDSIDE
	, L1.PORTION1 + L2.PORTION1 + L3.PORTION1 + L4.PORTION1 AS P1
	, L1.PORTION2 + L2.PORTION2 + L3.PORTION1 + L4.PORTION1 AS P2
	, .....
FROM BONES AS L1
	INNER JOIN BONES L2
		ON L1.ANIMAL = L2.ANIMAL
			AND L1.HANDSIDE = L2.HANDSIDE
			-- only join ID 3 with ID 5 and not other permutations.
			AND L1.ID < L2.ID
	INNER JOIN BONES L3
		ON L2.ANIMAL = L3.ANIMAL
			AND L2.HANDSIDE = L3.HANDSIDE
			-- only join ID 3 ,5, 20 and not other permutations.
			AND L2.ID < L3.ID
	INNER JOIN BONES L4
		ON L3.ANIMAL = L4.ANIMAL
			AND L3.HANDSIDE = L4.HANDSIDE
			-- only join ID 3 ,5, 20, 100 and not other permutations.
			AND L3.ID < L4.ID
WHERE
	L1.PORTION1 + L2.PORTION1 + L3.PORTION1 + L4.PORTION1 <= 100
	AND L1.PORTION2 + L2.PORTION2 + L3.PORTION2 + L4.PORTION2 <= 100
	AND L1.PORTION3 + L2.PORTION3 + L3.PORTION3 + L4.PORTION2 <= 100
	....


UNION ALL

	.....

It's Microsoft marketing that tells you computers are qualified for non-technicians
W11 22H2 (build 22621), LO 7.4.2.3(x64)
Mordtret
Posts: 5
Joined: Sun Jan 21, 2018 9:23 am

Re: Building a query for fragment identification

Post by Mordtret »

Thank you very much!

I will try it and play with it for a while and than come back with my results to you
Open office 4.1.3 on Win 10;
Post Reply