Custom import - getCellByPosition bottleneck

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
Post Reply
jeremys
Posts: 3
Joined: Thu Aug 30, 2018 2:50 pm

Custom import - getCellByPosition bottleneck

Post by jeremys »

I am currently creating a custom importer/exporter for my own file type with java using XFilter.

I now have a serious bottleneck when importing. Here is my code for inserting my content into a spreadsheet:

Code: Select all

    XSpreadsheetDocument document = UnoRuntime.queryInterface(
        XSpreadsheetDocument.class,
        targetDocument);
    XMultiServiceFactory serviceFactory = UnoRuntime.queryInterface(
        com.sun.star.lang.XMultiServiceFactory.class,
        document);

    for (int i = 0; i < numberOfSpreadsheets; i++) {
        Object spreadsheetObject = serviceFactory.createInstance("com.sun.star.sheet.Spreadsheet");
        XSpreadsheet spreadsheet = UnoRuntime.queryInterface(
            XSpreadsheet.class,
            spreadsheetObject);
        document.getSheets().insertByName(spreadsheetNames.get(i), spreadsheet);
        
        // Not sure about this part
        spreadsheet = UnoRuntime.queryInterface(
            XSpreadsheet.class, 
            document.getSheets().getByName(spreadsheetNames.get(i));

        for (int rowIndex = 0; rowIndex < numberOfRows; rowIndex++) {
            for (int columnIndex = 0; columnIndex < numberOfCellsInRow; columnIndex++) {
                // This call to getCellByPosition seems to be the bottleneck
                XCell cell = spreadsheet.getCellByPosition(columnIndex, rowIndex);

                XText text = UnoRuntime.queryInterface(XText.class, cell);
                text.setString(cellValue);
            }
        }
    }
Is there any way I can omit all these calls to getCellByPosition?
Is there some sort of bulk API available?
OpenOffice 4.1.5 on Windows 10
User avatar
RoryOF
Moderator
Posts: 34612
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: custom import - getCellByPosition bottleneck

Post by RoryOF »

Perhaps try getCellRangeByPosition and do it on a line by line basis.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
jeremys
Posts: 3
Joined: Thu Aug 30, 2018 2:50 pm

Re: custom import - getCellByPosition bottleneck

Post by jeremys »

RoryOF wrote:Perhaps try getCellRangeByPosition and do it on a line by line basis.
This is not any faster.

I also tried setDataArray, which is faster, but does not allow to format the cells. And the formatting afterwards with getCellByPosition still takes a lot of time.
OpenOffice 4.1.5 on Windows 10
jeremys
Posts: 3
Joined: Thu Aug 30, 2018 2:50 pm

Re: Custom import - getCellByPosition bottleneck

Post by jeremys »

I now implemented XExportFilter and XDocumentHandler for the export and it is also very slow.
OpenOffice 4.1.5 on Windows 10
User avatar
Zizi64
Volunteer
Posts: 11359
Joined: Wed May 26, 2010 7:55 am
Location: Budapest, Hungary

Re: Custom import - getCellByPosition bottleneck

Post by Zizi64 »

I also tried setDataArray, which is faster, but does not allow to format the cells. And the formatting afterwards with getCellByPosition still takes a lot of time.
Use the Styles instead of the direct formatting properties. You can preformat the target template file, or you can create/apply cell styles by your program code.
Tibor Kovacs, Hungary; LO7.5.8 /Win7-10 x64Prof.
PortableApps/winPenPack: LO3.3.0-7.6.2;AOO4.1.14
Please, edit the initial post in the topic: add the word [Solved] at the beginning of the subject line - if your problem has been solved.
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Custom import - getCellByPosition bottleneck

Post by Villeroy »

Of course you are not able to apply any kind of number format after setting the cells STRING.
A spreadsheet full of strings is not a spreadsheet anymore. It would be easier to write text files (csv) instead.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Lupp
Volunteer
Posts: 3549
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: Custom import - getCellByPosition bottleneck

Post by Lupp »

(Yes, I accept this evidence that C code is even uglier than Basic code.
I will try to talk about this in simple, human readable terms.)

Whatever you want to do with the cell range
myRg = spreadsheet.getCellRangeByPosition(0, EndColumn, 0, EndRow)
will be done slow if you create the cell objects one by one and then play with them. (In current LibO V6 it's even much slower.)

If there is remedy depends essentially on your answer to the following question:

Do you need to be able to assign content (.Value, .String, or .Formula) and attributes (many; partly best bundled in cell styles) differently for each cell, or are the assignments the same for all the cells in the range (or for subsets of cells described by a reasonably low number of rectangular subranges)?

In the first case you are lost.
In the second case there are ways to do it in a blink of an eye for a few thousand cells. (Least fast for setting a formula I assume.)
On Windows 10: LibreOffice 24.2 (new numbering) and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
Post Reply