- It opens up calc to display the spreadsheet. I would prefer for the code to just run in the background.
- It appears that a cell contains either a double or a formula.
- Have I missed an interface?
- Is there a method to determine what the formula represents?
- Dates & times appear to be represented as doubles in the same way as lotus & excel
- I grabbed the code from Jakarta POI to convert the double to a date.
- Is there an OOo equivalent?
- Is there a better way to approach this problem?
- To be honest what I am looking for is the ODF equivalent of Jakarta POI
- Can use see any other obvious areas for improvement?
Reading contents of spreadsheet - best practice
Reading contents of spreadsheet - best practice
I have made my first attempt at writing some java code to extract the contents of a spreadsheet. However, I am not particularly happy with the code as it stands, so I was hoping that some people here could offer some advice.
Re: Reading contents of spreadsheet - best practice
As far as I can see on the first glace, POI tries it's best to read (and write?) binary office files on a server, particulary without having MSOffice installed. You are using the office API so it is not comparable to POI.4. Is there a better way to approach this problem?
* To be honest what I am looking for is the ODF equivalent of Jakarta POI
Well, everybody loves to read MSOffice files. Since OOXML has been acknowleged as ISO standard there are strong effords to develop toolkits for the new shit.
However, there is a promising project related to ODF: http://odftoolkit.openoffice.org/. Unfortunately, it has been promising for quite a long time now. Personally I believe that Microsoft is going to win another format race.
Search this forum or the old one http://www.oooforum.org/forum/search.phtml for "headless"1. It opens up calc to display the spreadsheet. I would prefer for the code to
just run in the background.
A cell has a formula, a value and a string.2. It appears that a cell contains either a double or a formula.
* Have I missed an interface?
* Is there a method to determine what the formula represents?
- The formula is a string, stored within the file in English notation. It is localized in the GUI: c.s.s.table.XCell.getFormula(), for instance "3.14159", "Text Value", "=SUM(A1:B5)"
- The value is the numeric value (zero for text values, errors or blanks), c.s.s.table.XCell.getValue()
- The string is the displayed text (literal text value, formatted number or error-string), c.s.s.text.XTextRange.getString()
- Property c.s.s.sheet.SheetCell.FormulaLocal reflects the localized formula string as it is currently used in the formula bar: "3,14159" (comma), "Text Value", "=SUMME(A1:B5)" (German SUMME)
- Nested sequence (list of lists) of all formulas in a range: []string c.s.s.sheet.XCellRangeFormula.getFormulaArray()
- Nested sequence (list of lists) of all values in a range: c.s.s.sheet.XCellRangeData.getDataArray() [string, double, or Null for errors]
- The array formula of a range, for instance "{=TRANSPOSE(A1:D4)}" through c.s.s.sheet.XArrayFormulaRange.getArrayFormula(). Unfortunately, this interface has a bug. It returns the localized array-formula, for instance German "{=MTRANS(A1:D4)}". You get the correct English notation from c.s.s.table.XCell.getFormula() which is by definition the same for all cells in the array. Nother strange issue: c.s.s.sheet.XArrayFormulaRange.setArrayFormula("TRANSPOSE(A1:D4)") [English without "{=}"]
This is where things become complicated3. Dates & times appear to be represented as doubles in the same way as lotus & excel
* I grabbed the code from Jakarta POI to convert the double to a date.
* Is there an OOo equivalent?
Calc's day zero is 1899-12-30 by default. This is a per-document setting. The GUI offers 2 common alternatives 1900-01-01 (ISO-SQL) and 1904-01-01 (Quattro?). In fact this date can be any other day (founding day of my company? my birhtday?). c.s.s.NumberFormatSettings.NullDate
Excel's default is none of these. Excel's day zero defaults to 1899-12-31, 1900-01-01 is day #1. Due to the infamous leap-year bug in Excel, this one-day offset makes a difference for the first 60 days until 1900-02-28. From 1900-03-01 (day #61) both Excel and Calc map the same numbers to the same dates when using their respective defaults. See http://user.services.openoffice.org/en/ ... 490#p14490 with Excel document attached. For string conversion there is another relevant parameter: c.s.s.NumberFormatSettings.TwoDigitDateStart
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Reading contents of spreadsheet - best practice
odftoolkit does look like what I want and I did find it in my searching, but it doesn't look ready yet which is a shame.Villeroy wrote:However, there is a promising project related to ODF: http://odftoolkit.openoffice.org/. Unfortunately, it has been promising for quite a long time now. Personally I believe that Microsoft is going to win another format race.
I was aware of the excel leap year bug, not aware of the complexities of date handling in OpenOffice.Villeroy wrote:This is where things become complicated3. Dates & times appear to be represented as doubles in the same way as lotus & excel
* I grabbed the code from Jakarta POI to convert the double to a date.
* Is there an OOo equivalent?
Calc's day zero is 1899-12-30 by default. This is a per-document setting. The GUI offers 2 common alternatives 1900-01-01 (ISO-SQL) and 1904-01-01 (Quattro?). In fact this date can be any other day (founding day of my company? my birhtday?). c.s.s.NumberFormatSettings.NullDate
Excel's default is none of these. Excel's day zero defaults to 1899-12-31, 1900-01-01 is day #1. Due to the infamous leap-year bug in Excel, this one-day offset makes a difference for the first 60 days until 1900-02-28. From 1900-03-01 (day #61) both Excel and Calc map the same numbers to the same dates when using their respective defaults. See http://user.services.openoffice.org/en/ ... 490#p14490 with Excel document attached. For string conversion there is another relevant parameter: c.s.s.NumberFormatSettings.TwoDigitDateStart
Would the best approach be to use the c.s.s.text.XTextRange.getString() method and use java to parse the date string?
Alternatively I could look at changing the getJavaDate method in my code to accept a starting date based on c.s.s.NumberFormatSettings.NullDate but that sounds messy. Surely I cannot be the first person who wants to do this?
Re: Reading contents of spreadsheet - best practice
Possibly you are the first person who really caresSurely I cannot be the first person who wants to do this?
Things become really complicated when you need to process any spreadsheet thrown at your program. Same issues in any spreadsheet, nothing Calc specific. IMHO the main problem is that spreadsheets accept any value in all the cells, be it text, number, empty or error.
I would not recommend date conversion by strings. "YY||DD@MMM" could be a userdefined date format in any locale (foreign names of days and months plus two-digit year) or even "On a glory "dddd" in the year of the Lord "YYYY ("on a glory Sunday in the year of the Lord 1999") is a valid number format, representing a numeric value as date. The numeric conversion with known day zero is more reliable. Why not use the spreadsheet itself?
This is how one could try to handle the pitfalls in Python.
Code: Select all
[...]
class FunctionProvider(XFunctionAccess, unohelper.Base):
def __init__(self, smgr):
self.FunctionAccess = smgr.createInstance('com.sun.star.sheet.FunctionAccess')
def setProperties(self, **dArgs):
for p in dArgs.keys():
self.FunctionAccess.setPropertyValue(p, dArgs[p])
def setPropertiesByInfo(self, propInfo):
"""Sync FunctionAccess service with a document's PropertySetInfo."""
for p in self.FunctionAccess.PropertySetInfo.getProperties():
s = p.Name
if propInfo.hasPropertyByName(s):
v = propInfo.getPropertyByName(s)
self.FunctionAccess.setPropertyValue(s, v)
def callFunction(self, name, *args):
return self.FunctionAccess.callFunction(name, *args)
[...]
# Adjust all FunctionProvider's properties to the corresponding properties of the respective document.
FP = FunctionProvider(Office.smgr,)
FP.setPropertiesByInfo(myDoc.getPropertySetInfo())
Code: Select all
# as above:
FP = FunctionProvider(Office.smgr,)
FP.setPropertiesByInfo(myDoc.getPropertySetInfo())
# getting an array of ISO-date strings from all values in A2:A1000
oRange = oSheet.getCellRangeByPosition(0,1,0,999)
aData = oRange.getDataArray()
aISOStrings = FP.callFunction("TEXT", (aData, "YYYY-MM-DD"))
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Reading contents of spreadsheet - best practice
Aww, I thank you for your answer, but seems I can't implement it : (
I am so iliterate about all this.
Could you please give me syntax to for example print $A$2:$D$46 from current sheeet into c:/bla.txt ?
dim args3(0) as new com.sun.star.beans.PropertyValue
args3(0).Name = "ToPoint"
args3(0).Value = "$A$2"
args3(0).Value = "$A$2:$D$46"
dispatcher.executeDispatch(document, ".uno:GoToCell", "", 0, args3())
dim args4(0) as new com.sun.star.beans.PropertyValue
args4(0).Name = "ToPoint"
args4(0).Value = "$A$2:$D$46"
dispatcher.executeDispatch(document, ".uno:GoToCell", "", 0, args4())
dispatcher.executeDispatch(document, ".uno:Copy", "", 0, Array())
rem ----------------------------------------------------------------------
Dim FileNo As Integer
Dim Filename As String
Filename = "c:/bla.txt"
FileNo = Freefile
Open Filename For Output As #FileNo
Print #FileNo, <<< what should I put here so all content of $A$2:$D$46 is saved in txt file
Close #FileNo
Thank you
I am so iliterate about all this.
Could you please give me syntax to for example print $A$2:$D$46 from current sheeet into c:/bla.txt ?
dim args3(0) as new com.sun.star.beans.PropertyValue
args3(0).Name = "ToPoint"
args3(0).Value = "$A$2"
args3(0).Value = "$A$2:$D$46"
dispatcher.executeDispatch(document, ".uno:GoToCell", "", 0, args3())
dim args4(0) as new com.sun.star.beans.PropertyValue
args4(0).Name = "ToPoint"
args4(0).Value = "$A$2:$D$46"
dispatcher.executeDispatch(document, ".uno:GoToCell", "", 0, args4())
dispatcher.executeDispatch(document, ".uno:Copy", "", 0, Array())
rem ----------------------------------------------------------------------
Dim FileNo As Integer
Dim Filename As String
Filename = "c:/bla.txt"
FileNo = Freefile
Open Filename For Output As #FileNo
Print #FileNo, <<< what should I put here so all content of $A$2:$D$46 is saved in txt file
Close #FileNo
Thank you
OOo 2.4.X on Ms Windows XP