Hi all
I'm creating data sets from various data sources, see here for some of them, to get an idea of what I'm doing.
http://gsociology.icaap.org/dataupload.html
My question is this:
I create a data set by taking a list of countries and it's characteristic (e.g., population in 1998) and then matching it to another list of countries and it's characteristic (e.g., infant mortality rate, 1998).
In some cases, the list of countries is from the same database, so the country names are identical, but each list may not have all of the same countries. In other cases, I'm trying to match lists of countries from two different databases, so the names of countries are not always identical.
I've been doing this by hand. Is this something that Open Office database can help me with? Or some other part of Open Office?
Thanks
Gene
Gene Shackman
The Global Social Change Research Project
http://gsociology.icaap.org/
matching records, creating database
-
- Posts: 4
- Joined: Sun Mar 02, 2008 1:55 am
- Location: USA
Re: matching records, creating database
No, OOo cannot directly help you to solve your problem.
What you face is a well known issue of many people who are working with different sources of information that may not have the exact same coding structure. Basically, to properly join information from various tables, database engines require exact matching. What I would suggest in your case is to have, for each variation of the name of a country, a table that has this name, the ISO code of the country and its official name. I would then join this table to any table containing a country name, and use the country code for the join of two different sets of information. Example:
Join table 1 and table 2 using Country Name as common field, then table 1 and table 3 using also the country name as common field, then finally the result of the first join with the result of the second join using the country code as the common field.
What you face is a well known issue of many people who are working with different sources of information that may not have the exact same coding structure. Basically, to properly join information from various tables, database engines require exact matching. What I would suggest in your case is to have, for each variation of the name of a country, a table that has this name, the ISO code of the country and its official name. I would then join this table to any table containing a country name, and use the country code for the join of two different sets of information. Example:
Code: Select all
Country Name Country Code Official Name
============ ============= ====================================
Suisse CH Switzerland
Switzerland CH Switzerland
Schweiz CH Switzerland
Country Name Population
============ ==========
Suisse 9,008,123
Country Name Mortality rate
============ ==============
Schweiz 0.0004
LibreOffice 4.2.3.3. on Ubuntu 14.04