Page 1 of 1

[Tutorial] External Links In Calc

Posted: Fri Mar 07, 2008 2:58 am
by Villeroy
5 methods to link data in a spreadsheet. Methods 1 and 2 import whole sheets, 3 and 4 import ranges or html-tables from web pages, method 5 imports tables and queries from databases or from sources that can be seen as databases. In chapters 1 to 4 source file refers to a file from where we export data, whereas target file refers to the file which imports data one way or the other.

1. Reference By File-URL:
Import hidden data-sheets from source spreadsheet files, dBase and text tables (*.csv) by URL-references ='file:///path/name.xls'#$SheetX.$A$1
The Easy Way To URL-Links: If you don't want to type long formulas, open both documents, start a formula with '=' and use the mouse to point a range in the source document. The reference can be part of any formula or range name. The source document has to be saved to disk.

How It Works: 'file:///path/name.xls'#$SheetX.$A$1 actually refers to cell A1 on a hidden sheet named 'file:///path/name.xls'#$SheetX. The hidden sheet will be created on the fly when the calculation cascade stumbles upon such a reference. Works with text files (*.csv) as well when you first open the csv and use the above mentioned pointer-method. Manual update via Menu:Edit>Links...[Update] will re-read the data copy from the referenced file into the hidden sheet.

##########################################################################
 Edit: 2009-05-17: Version 3.1 does not use hidden sheets anymore 
http://development.openoffice.org/releases/3.1.0.html
Release notes of v3.1 wrote: Storage of data pulled in by external references changed
http://www.openoffice.org/issues/show_bug.cgi?id=92797
feature-info:
The implementation of how data pulled in by references to external documents in formulas is stored internally was changed. Previously, all data of a referred external sheet was stored in a hidden sheet. Now only the data referred is stored in a separate structure, hidden sheets are not created anymore. This has the advantages that - By not creating hidden sheets the amount of available 255 sheets is not decremented. - It should be possible to refer more than 254 external sheets, given enough memory for the amount of data of course. - Less data needs to be stored when saving the document to file. - No unintended data leakage may occur because data not referred is not saved to files. Note that the external sheets will not show up anymore in the Navigator. For ODF file storage the data is still saved as external sheets, previous versions will be able to read the document. Re-saving an already existing document with the new version may result in a significantly decreased size of the file's storage, depending on the previous amount of data in the entire sheet and the amount of data actually used in formulas.
##########################################################################
 Edit: 2009-08-22: OOo 3.1 had several difficulties with the new implementation that had been fixed in 3.1.1 
Commands And Settings: Menu:Edit>Links... [Update | Modify... | Break link].
Menu:Tools>Options...Calc>General>'Update links when opening' [Always | Never | Ask]
Menu:Format>Sheets>Show... displays the hidden sheet. It contains values only, no formulas. When you update the link, the application tries to re-read the values from the source file into this document's hidden sheet. Since data are read from disk you should have saved a possibly loaded source file before update.

This is the preferred way of linking when you want to embed all the used sheets of your calculation model. There is no need to pass over all the linked files when you send your calculation model to someone else. However, the recepient must not refresh the link, since this would clear the hidden sheet. If this happened by accident, reload without saving.

Quirks:
- Contrary to Excel, you can not use range names defined in the source document. However, you can create names in the target document referring to external sheets. Use a range name referring to
='file:///path/name.xls'#$SheetX.$A$1
instead of reference
='file:///path/name.xls'#RangeName
- When you modify a link in the dialog (Edit>Links...), you will be prompted for a file, but there is no option to specify a sheet name. Thus the newly linked file must contain a "Sheet1" if the old link referenced to "Sheet1" in another file.
- You can not break a link until you have removed all the references. As long as there is a single reference 'file:///path/name.xls'#$SheetX.$A$1 the next calculation cascade will re-establish a link with hidden sheet 'file:///path/name.xls'#$SheetX.
- Even if the URL looks like an absolute one it is a relative URL. You can't move both files together into another file hierarchy without breaking the link.
 Edit: For relative links refer to the attachement at the end of this tutorial 
 Edit: Edit 2009-10-05, v3.1: Even if the URL looks like an absolute one it is a relative URL. Now we can move both files together into another file hierarchy without breaking the link. 
Clean Up: Use the Find&Replace tool to search for all references within formulas. Comment out the respective formulas (prepend a single quote or remove the =). Check named references as well (Ctrl+F3). Call Menu:Edit>Links...[Break Link]. Delete the hidden sheet or move it out of the file.
Hint For API Users:
http://api.openoffice.org/docs/common/r ... SheetLinks (c.s.s.sheet.SheetLinkMode.VALUE)
 Edit: Since v3.1: http://api.openoffice.org/docs/common/r ... Links.html 
2. Linked Sheet From File
Import visible sheets from spreadsheet files, dBase and text tables (*.csv).
This is the preferred method to import calculation models (formulas) if you are aware of the details described below.

Menu:Insert>'Sheet From File...' with option "Link". Without option "Link" this feature does the same thing as copying a sheet from the source file through Menu:Edit>Sheet>Move/Copy...
Commands And Settings: Menu:Edit>Links... [Update | Modify... | Break link].
Menu:Tools>Options...Calc>General>'Update links when opening' [Always | Never | Ask]

How It Works: Contrary to the above URL-method this method imports a visible sheet including formulas. The following is important to know when you get #REF! errors because references can not be adjusted accordingly.
If the sheet references other sheets in the source file, the references in the new file will adjust by means of sheet positions.
If the source file's first sheet is named "Sheet1" and the target document's first sheet is named "First", then $Sheet1.A1 will adjust to $First.A1.
A relatively referenced sheet Sheet1.A1 will adjust to a sheet having the same position relative to the position in the source file. The same principle applies to range names (name=$Sheet1.A1 becomes $First.A1).
For instance, you will get #REF! errors when you import a sheet to the first position if the sheet contains references to the fist sheet. Same with relative references to the preceeding sheet if there is no preceeding sheet in the target document.
Hint For API Users:
http://api.openoffice.org/docs/common/r ... SheetLinks (c.s.s.sheet.SheetLinkMode.NORMAL)

3. Import External Data By Name
Import embedded tables by name. Works with tables in html sources and with named ranges in spreadsheet documents. This seems to be the only method to import html-tables from web-pages through the GUI.
Menu:Insert>Link To External Data... You can set a time interval for automatic updates.
Commands And Settings: Menu:Edit>Links... [Update | Modify... | Break link].
Menu:Tools>Options...Calc>General>'Update links when opening' [Always | Never | Ask]
How It Works: Named ranges from spreadsheet sources seem to work like the whole sheets in the previous chapter.
Quirks
- It can be difficult to choose the right html-tables from the dialog since the are offered by generic names (table_1, table_2,...). This may lead to some try-and-error experiments. Studying the html source may be another option.
- When importing html there seems to be no way to prevent number recognition.
 Edit: Since version 4 we can specify the local context of the import and if "special numbers" should be interpreted. "Special numbers are dates, times and any other figures consisting of more than digits and decimal point/comma. 
Hint For API Users:
http://api.openoffice.org/docs/common/r ... #AreaLinks

4. Cell Function DDE
Import values from source spreadsheet files. Other types of sources seem to fail (tried dBase and csv).
The Easy Way To DDE-Links:
- Copy a range from the source file
- Select a cell in the target file and call Menu:Edit>'Paste Special...'
- Check option "Link"
This will produce a formula like
=DDE("soffice";"C:\path\name.xls";"reference";0)
See online help on "DDE function".
- Server name "soffice" should work within OpenOffice.org. I don't know which other server names may work under MS Windows. Since there is no DDE in other operating systems, OOo comes with it's own implementation of DDE.
- The file argument needs to be a string in system notation rather than file:///-URL.
- The reference can be a string like "SheetX.B1" or a named reference of the source file. Caution: "B1" always refers to B1 on the source file's first sheet!
- The numeric fourth argument is optional and has this meaning:
0 or missing: Number format from this document's "Default" cell style
1: Data are always interpreted in the standard format for US English
2: Data are retrieved as text; no conversion to numbers

Commands And Settings: Menu:Edit>Links... [Update | Modify... | Break link | Manual or Auto Update].
Menu:Tools>Options...Calc>General>'Update links when opening' [Always | Never | Ask]
How It Works: The source document gets loaded invisibly, if it is not loaded otherwise. Since the source file is kept in memory, automatic update is possible. When you edit the source range, the related results in the target file update. You can switch to manual update in dialog Menu:Edit>Links...
Quirks:
- You can not easily use this type of link on other machines. Keep a central reference to the source file's path, so you can edit a single cell to adjust a link's path. Modifying the link in the links-dialog has a temporary effect until the DDE function recalculates.
- The count of linked cells to be imported at the same time seems to be limited to some thousands of cells. I don't know the details. An array formula {=DDE("soffice";$A$1;"Sheet1.$A$1:$A$65536")} works until row 32747 (A1 contains the path).
- Yes, spreadsheet function DDE can connect to Writer tables -- somehow, if we try hard enough: http://www.oooforum.org/forum/viewtopic.phtml?t=61950
Clean Up: Use the Find&Replace tool to search for all DDE formulas containing the unwanted file path. Comment out the respective formulas (prepend a single quote or remove the =). Check named references as well (Ctrl+F3). Finally, call Menu:Edit>Links...[Break Link].
Hint For API Users:
http://api.openoffice.org/docs/common/r ... l#DDELinks

5. Import From Registered Datasources
This is the method to import tabular data from various external sources, including external database servers. You may even import other spreadsheets via datasource into another spreadsheet when you want a sub-set of sheet-columns ordered by more than 3 sort fields.
A Digression Concerning Base And Registered Datasources.
A registered datasource is a Base document which exposes it's tables and queries to other office components (currently Writer and Calc). Datasources can be imported through drag&drop from the datasource window (shortcut F4) or through various tools such as Writer's mail merge and Calc's data pilot. You can manage database registrations in Menu:Tools>Options...Base>Databases. Unregistered Base documents can be used as such, but they are not exposed to Writer and Calc.
A Base document may be connected to one external source of data such as Oracle, MySQL, PostgreSQL, MSSQL, MS Access, dBase or even csv and spreadsheets. Thus your flow of data could be outlined like this:
[external_source] > Base.odb(registered) > Spreadsheet.ods or Writer.odt
"external_source" could be anything which exposes it's tabular data to the Base document. If the Base document is of the self-contained default type "embedded hsql" then all tables are saved within the Base.odb file without external source.

Exploring the Bibliography Example:
- While in Writer or Calc, hit F4 and unroll the example database "Bibliography". It contains "Tables" and "Queries". The only existing element is a table named "biblio".
- Call "Registered databases..." from the context-menu of "Bibliography" in order to see which odb file is exposed by this registered datasource: "Bibliography" represents a database document stored as <ooo_user_dir>/user/database/biblio.odb. Close the dialog.
- Call "Edit database file..." from the context-menu of "Bibliography". This opens the databse document biblio.odb.
- The database has one element "biblio" in it's tables container.
- The status bar in the database's main window indicates an external source of type "dBase" in directory <ooo_user_dir>/user/database/biblio/. Menu:Edit>Database>Properties... shows more details about the connections.

How To Import A Linked Copy "Bibliography"
- Drag table "biblio" from the datasource window's left pane onto a Calc cell on a unused sheet.
- Edit the table in the datasource window's right pane.
- Click a cell in the imported cell range and call Menu:Data>Refresh. The refreshed import range will reflect the changed datasource.

Commands And Settings:
The refreshable import range is a named database range named "Import1". Call Menu:Data>Define..., select the import range and hit button "More..."
The additional options are:
- Contains header [always true for import ranges]
- Insert/delete cells [adjacent formula cells at the right or at the bottom will be adjusted to the size of the database range]
- Keep formatting [of the sheet cells]
- Don't save import data [save disk space, enforce manual refresh after opening the file]
A string at the bottom of the dialog indicates the source.

Menu:Data>Select... lets you select this document's database ranges.
Menu:Data>Refresh refreshes a selected database range (a single cell within a range suffices as well).
Menu:Data>DataPilot>Start... Option:"Datasource registered in OpenOffice.org" lets you create a data pilot (aka pivot table) from a datasource table or query.

Quirks
- Datasources require some setup before you can use them on other systems. The setup details and the feature set depend on the type of datasource (native, embedded hsql or some external type).
- If you like to have other names than "Import1", "Import2",... for your import ranges then create a one-cell database manually before dragging an object onto the cell.
- Base supports parameter queries where the user has to provide the arguments through a dialog, but there is not method built-in to substitute parameters with cell values. The following document provides a set of macros to overcome this limitation: http://user.services.openoffice.org/en/ ... php?id=248
- When you create a data pilot (aka pivot table) from data source neither time fields nor time stamps get imported. http://www.openoffice.org/issues/show_bug.cgi?id=81848
Hint For API Users:
http://api.openoffice.org/docs/common/r ... l#DDELinks
http://api.openoffice.org/docs/common/r ... iptor.html
http://api.openoffice.org/docs/common/r ... Range.html
Data pilots created from data sources are not accessible through the API.

####################################################################################################################
 Edit: 2008-08-11: Attaching an example for relative linking. 
 Edit: 2009-08-22: The attached example can not work with v3.1. I attached a new version for v3.1.1 where relative linking by means of INDIRECT(ADDRESS(...)) works as expected with single values and arrays 
Extract the pair into the same directory. The path-name must not contain single quotes. Open target.ods and follow the formulas and annotations from top to bottom. Refer to chapter 1 of this tutorial.

Re: [Tutorial] External Links In Calc

Posted: Sat Mar 15, 2008 8:51 pm
by James
This post takes the topic off the unanswered list.