
[Solved] Severe perf degradation on Linux

Posted: Sat Dec 27, 2014 1:52 am
by _savage
I’m using OO on my Mac, and using it through the Python UNO API works just fine. For example, running a loop like

Code: Select all

>>> parenum = document.Text.createEnumeration()
>>> while parenum.hasMoreElements() :
...     par = parenum.nextElement()
takes about 3s for an average test document. I have then tried to do the same on a Linux server into which I ssh. OO is running headless (just like on my Mac locally) on that Linux server:

Code: Select all

soffice --accept="socket,host=localhost,port=2002;urp;StarOffice.ServiceManager" --headless --invisible &
but the above loop to iterate over all paragraphs takes several minutes to complete. Note: the Python script too runs on that Linux server, so network latency should be out of the equation.

Any ideas as to why this would be so slow? I’ve noticed that even in --headless mode a Writer window opens to display the document on my Mac; I have a slight suspicion that perhaps the performance degradation results from OO trying to interact with X?

Thanks!

Re: Severe perf degradation

Posted: Sat Dec 27, 2014 2:07 am
by Villeroy
Your loop queries object by object over the network.

Re: Severe perf degradation

Posted: Sat Dec 27, 2014 2:12 am
by RusselB
At least part of the problem will be due to the response time from the server, which is affected by the actual (not rated) connection speed.
When running the code on your own system there's no need for the code to try to access an internet connection.
While I don't know all of the possible internet connection speeds, nor the access speed of your hard drive, I highly doubt that your internet connection speed is anywhere close to your hard drive access speed.
Using my numbers, as an example, my hard drive has a rated maximum sustainable transfer rate of 110M/s. In comparison, my internet connection is rated at a maximum of 50M/s. Therefore, comparing based on maximum speeds, something running through the internet is going to take 2.2x the amount of time... and, statistically speaking, it's rare to get a sustainable connection speed at these rates, but I don't have access to average numbers.

Re: Severe perf degradation

Posted: Sat Dec 27, 2014 2:23 am
by _savage
Sorry, I should have been clearer. I ssh to that remote Linux server and run the Python script locally on that remote box. So network latency is out of that equation. (I'll update the question accordingly.)

Re: Severe perf degradation

Posted: Sat Dec 27, 2014 9:35 am
by karolus
_savage wrote: I ssh to that remote Linux server and run the Python script locally on that remote box. So network latency is out of that equation.
....But not the "network latency" triggered by ssh

Re: Severe perf degradation

Posted: Sat Dec 27, 2014 9:40 am
by _savage
karolus wrote:....But not the "network latency" triggered by ssh
I don't follow. I have a standing ssh connection to some remote Linux host. On that Linux host runs OO in headless mode, and on that same host I open a Python interpreter and connect to OO. As far as that Python/OO interaction is concerned, they are both local on the same host; in fact, the Python code connects to "localhost:2002".

How does the network latency of ssh influence that local connection between the Python interpreter and OO?

Re: Severe perf degradation

Posted: Sun Dec 28, 2014 11:19 pm
by _savage
I don't think there is network overhead. Take a look at the attached Python script: it spins up an office instance and iterates over a test document. Because the test document is too large to attach to this post, I've uploaded it here. (It's a document of 5 x 10,000 lorem ipsum words, generated with the lipsum generator.)

Running that on my Mac locally takes about 1.5s. Then I remote into the Linux box and open a screen session, then

Code: Select all

sleep 60 && PYTHONPATH=/usr/lib64/openoffice/program python3.4 ./lorem.py
and disconnect from the Linux box. So while the Python script iterates over the paragraphs of the test document, there is no network connection to begin with. The script runs for 45s!

The next logical step would be profiling, but before I begin digging deeper into this, I wonder if people can reproduce the slowdown.

Note: The same slowdown can be measured when running this on a local Linux box.

Re: Severe perf degradation on Linux

Posted: Mon Dec 29, 2014 11:55 am
by karolus
Hello

To me it seems very unusual for a current Linux distribution to have a path /usr/lib64/openoffice/program?

Next question: there is a hardcoded Mac path in your lorem.py, which you run on Linux. How is that supposed to work?
The script runs for 45s!
Does the whole script run for 45 s, or is that the duration reported by your:

Code: Select all

print("Elapsed time: " + str(end_time - start_time))
?


Local test with your lorem.odt as the current document, from an IPython notebook session:
savage_test_run.png
Karolus

Re: Severe perf degradation on Linux

Posted: Mon Dec 29, 2014 12:38 pm
by _savage
karolus wrote: To me it seems very unusual for a current Linux distribution to have a path /usr/lib64/openoffice/program?
That's where things are installed on the Gentoo box. Ubuntu has a different path, I think. But while perhaps unusual, it doesn't really make a difference to the problem itself, or does it? ;-)
karolus wrote: Next question: there is a hardcoded Mac path in your lorem.py, which you run on Linux. How is that supposed to work?
My bad, I updated the script. I had accidentally uploaded the unedited file from my Mac; of course, on Linux it just calls "soffice".
karolus wrote:
The script runs for 45s!
Does the whole script run for 45 s, or is that the duration reported by your:

Code: Select all

print("Elapsed time: " + str(end_time - start_time))
?
It's the output of that print statement, i.e. the loop runs for 45s on my Gentoo server, on a native Ubuntu installation, and in an Ubuntu VM on Mac. Very consistently reproducible so far.
karolus wrote: Local test with your lorem.odt as the current document, from an IPython notebook session:
savage_test_run.png
That's what I see on my local Mac as well. What do you mean by IPython notebook, this one?

Karolus, what is your local office install, and is it out of the box or did you build it yourself?

Re: Severe perf degradation on Linux

Posted: Mon Dec 29, 2014 3:23 pm
by karolus
Hello
What do you mean by IPython notebook, this one?
[1] Yes, it is.

[2] In this case mostly out of the box, with the .deb packages from http://www.libreoffice.org/download/libreoffice-fresh/
It installs to /opt/libreoffice4.3/
Only for the IPython notebook did I need to extend PYTHONPATH in /opt/libreoffice4.3/program/python:
...
PYTHONPATH=$sd_prog:$sd_prog/python-core-3.3.3/lib:$sd_prog/python-core-3.3.3/lib/lib-dynload:$sd_prog/python-core-3.3.3/lib/lib-tk:$sd_prog/python-core-3.3.3/lib/site-packages:$HOME/miniconda3/envs/note/lib/python3.3:$HOME/miniconda3/envs/note/lib/python3.3/lib-dynload:/usr/local/lib/python3.3/dist-packages${PYTHONPATH+:$PYTHONPATH}
...

First I start the notebook from the command line:

Code: Select all

/opt/libreoffice4.3/program/python -m IPython notebook
From the notebook I then start soffice as a server:

Code: Select all

from subprocess import Popen

officepath = '/opt/libreoffice4.3/program/soffice'
calc = '--calc'
pipe = "--accept=pipe,name=abraxas;urp;StarOffice.ServiceManager"

Popen([officepath,calc, pipe])
and connect to the office process with:

Code: Select all

import uno

local = uno.getComponentContext()
resolver = local.ServiceManager.createInstance("com.sun.star.bridge.UnoUrlResolver")

# Connect to the office process listening on the named pipe "abraxas".
client = resolver.resolve("uno:pipe,"
                          "name=abraxas;"
                          "urp;"
                          "StarOffice.ComponentContext")

createUnoService = client.ServiceManager.createInstance

# Create a handful of commonly used services in one go.
(desktop,
 file_access,
 pathsubstitution,
 mri,
 pipe,
 textout,
 textin,
 contentfactory) = map(
    createUnoService,
    ("com.sun.star.frame.Desktop",
     "com.sun.star.ucb.SimpleFileAccess",
     "com.sun.star.util.PathSubstitution",
     "mytools.Mri",
     "com.sun.star.io.Pipe",
     "com.sun.star.io.TextOutputStream",
     "com.sun.star.io.TextInputStream",
     "com.sun.star.frame.TransientDocumentsDocumentContentFactory"))
(This includes some boilerplate; of everything created in the last statement, essentially only the desktop is needed.)
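
The pipe and socket connection strings used in this thread differ only in their connection part. Below is a minimal sketch of a helper that assembles such strings, assuming the layout `uno:<connection>;urp;StarOffice.ComponentContext` seen in the posts above. The function name `build_uno_url` is hypothetical and not part of the UNO API.

```python
def build_uno_url(kind, **params):
    """Assemble a UNO connection URL like the ones used in this thread.

    `kind` is "pipe" or "socket"; `params` are the connection parameters
    (name=... for a pipe, host=... and port=... for a socket).
    Illustrative helper only, not part of the UNO API.
    """
    if kind not in ("pipe", "socket"):
        raise ValueError("kind must be 'pipe' or 'socket'")
    # Keyword arguments keep the order in which they were passed.
    parts = [kind] + ["%s=%s" % (key, value) for key, value in params.items()]
    return "uno:%s;urp;StarOffice.ComponentContext" % ",".join(parts)


pipe_url = build_uno_url("pipe", name="abraxas")
socket_url = build_uno_url("socket", host="localhost", port=2002)
print(pipe_url)
print(socket_url)
```

Either string could then be passed to `resolver.resolve(...)`, which makes it easy to switch between the two transports when comparing timings.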

Re: Severe perf degradation on Linux

Posted: Sat Jan 03, 2015 9:26 am
by _savage
Karolus, I tried the LO from the debian packages as you suggested (on Ubuntu) and it still runs 46s. You're on Mint Linux?

Re: Severe perf degradation on Linux

Posted: Sat Jan 03, 2015 9:58 am
by karolus
_savage wrote:Karolus, I tried the LO from the debian packages as you suggested (on Ubuntu) and it still runs 46s. You're on Mint Linux?
Yes, as you can see in my signature, but it doesn't matter which desktop you run; the .deb packages from LO come with all dependencies.
My guess: something else is causing this delay.

Karolus

Re: Severe perf degradation on Linux

Posted: Sat Jan 03, 2015 4:23 pm
by _savage
karolus wrote: My guess: something else is causing this delay.
Do you see the slowdown by running just the default python, not using IPy Notebook? I mean, if you run

Code: Select all

PYTHONPATH=/path/to/office/program python ./lorem.py
then how long does it take for that to complete? If I try this on a clean Mint Linux 17 ISO then I get the 45 sec instead of your 2 sec:

Code: Select all

mint@mint ~/dev $ PYTHONPATH=/usr/lib/libreoffice/program python3.4 ./lorem.py 
Spawning office instance pid=2901
Trying to connect to office socket.
Connected to office socket, closing test connection.
Elapsed time: 44.98345136642456
Other than the IPy Notebook, what's different on your machine?

Re: Severe perf degradation on Linux

Posted: Sat Jan 03, 2015 8:28 pm
by karolus
OK, I've tested with a simplified version of your lorem.py:

Code: Select all

import os
import subprocess
import time

import uno


def main():
    #soffice = 'soffice'
    soffice = '/opt/libreoffice4.3/program/soffice'
    # Start a headless office. To test socket mode, use the commented-out
    # socket strings here and in resolve() instead of the pipe strings.
    p = subprocess.Popen([soffice,
                          '--accept='
                          #"socket,host=localhost,port=2002"
                          "pipe,name=abraxas"
                          ";urp;StarOffice.ServiceManager",
                          "--headless"])

    print("Spawning office instance pid=" + str(p.pid))

    # Give the office process time to start listening.
    time.sleep(4)

    local = uno.getComponentContext()
    resolver = local.ServiceManager.createInstance("com.sun.star.bridge.UnoUrlResolver")
    context = resolver.resolve(
                               #"uno:socket,host=localhost,port=2002;"
                               "uno:pipe,name=abraxas;"
                               "urp;StarOffice.ComponentContext")

    loremurl = "file://%s" % os.path.expanduser('~/lorem.odt')
    desktop = context.ServiceManager.createInstance("com.sun.star.frame.Desktop")
    document = desktop.loadComponentFromURL(loremurl, "_blank", 0, ())

    # Time only the paragraph enumeration, not startup or document loading.
    start_time = time.time()

    parenum = document.Text.createEnumeration()
    while parenum.hasMoreElements():
        par = parenum.nextElement()

    end_time = time.time()
    print("Elapsed time: %0.2f" % (end_time - start_time))

    p.terminate()


if __name__ == "__main__":
    main()
 

Code: Select all

...@... ~ $ /opt/libreoffice4.3/program/python lorem.py
Spawning office instance pid=2754
Elapsed time: 3.88
# the version with pipe



...@... ~ $ /opt/libreoffice4.3/program/python lorem.py
Spawning office instance pid=2874
Elapsed time: 50.38
# and the version with socket instead of pipe
 
Now I see the same time differences from inside an IPython notebook session when starting the soffice server in pipe mode versus socket mode.
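
The simplified script waits a fixed four seconds for the office to start, while the log lines quoted earlier ("Trying to connect to office socket.") suggest the original script probes the socket instead. Below is a minimal sketch of such a probe for socket mode, using only the Python standard library. The name `wait_for_port` and its parameters are hypothetical, and the demonstration uses a throwaway local listener rather than a real office instance.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if `timeout` seconds pass without success."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open a test connection and close it again right away.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Demonstration against a throwaway local listener instead of a real office.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
ready = wait_for_port("127.0.0.1", listener.getsockname()[1], timeout=2.0)
listener.close()
print("port ready:", ready)
```

In the script above, a call such as `wait_for_port("localhost", 2002)` could stand in for `time.sleep(4)` when the office is started in socket mode; pipe mode would need a different readiness check.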

Karolus

Re: Severe perf degradation on Linux

Posted: Sun Jan 04, 2015 3:07 am
by _savage
Karolus, I think you've pointed out something very critical here! Socket mode versus pipe mode, and it makes a huge difference. Perhaps RusselB and Villeroy did have the right hint after all?!

On the remote Gentoo Linux server
Pipe mode: Elapsed time: 0.5140466690063477 with 550 pars
Socket mode: Elapsed time: 45.658066272735596 with 550 pars

On the local Ubuntu Linux machine
Pipe mode: Elapsed time: 0.8761096000671387 with 550 pars
Socket mode: Elapsed time: 45.715266942977905 with 550 pars

On my Mac
Pipe mode: Elapsed time: 1.6583800315856934 with 550 pars
Socket mode: Elapsed time: 1.941413164138794 with 550 pars

This doesn't quite solve the problem of why using sockets on Linux is so much slower, but it does solve the larger problem of having an efficient connection from a UNO Python client to the office server.
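
One way to read these numbers is per remote call. The sketch below assumes that each pass through the loop makes two synchronous UNO calls (`hasMoreElements` and `nextElement`), plus one final `hasMoreElements` that ends the loop. That call count is an assumption about how the bridge behaves, not something measured in this thread.

```python
PARAGRAPHS = 550
# Assumption: two bridge round trips per paragraph, plus the final
# hasMoreElements() call that terminates the loop.
CALLS = 2 * PARAGRAPHS + 1

# Elapsed times in seconds, copied from the measurements above.
elapsed = {
    ("Gentoo", "pipe"): 0.5140466690063477,
    ("Gentoo", "socket"): 45.658066272735596,
    ("Ubuntu", "pipe"): 0.8761096000671387,
    ("Ubuntu", "socket"): 45.715266942977905,
    ("Mac", "pipe"): 1.6583800315856934,
    ("Mac", "socket"): 1.941413164138794,
}


def per_call_ms(seconds, calls=CALLS):
    """Average cost of one remote call in milliseconds."""
    return 1000.0 * seconds / calls


for (host, mode), seconds in elapsed.items():
    print("%-7s %-6s %7.2f ms per call" % (host, mode, per_call_ms(seconds)))
```

Under that assumption, socket mode on both Linux machines costs roughly 41 ms per call, while every other combination stays below 2 ms. That pattern points to a fixed per-call stall rather than a throughput limit.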

Re: Severe perf degradation on Linux

Posted: Sun Jan 04, 2015 1:38 pm
by karolus
Hello

BTW: did you run any tests without ssh, with something like this on the server:

Code: Select all

    soffice = '/opt/libreoffice4.3/program/soffice'
    p = subprocess.Popen([ soffice,
                          '--accept='
                          "socket,host=192.168.2.100,port=2002" #your local netmask?
                          ";urp;StarOffice.ServiceManager",
                          "--headless"])
and from another machine on the local network:

Code: Select all

    local = uno.getComponentContext()
    resolver = local.ServiceManager.createInstance("com.sun.star.bridge.UnoUrlResolver")
    context = resolver.resolve(
                               "uno:socket,host=192.168.2.22,port=2002;" #the server address?
                               "urp;StarOffice.ComponentContext")
??

Re: Severe perf degradation on Linux

Posted: Sun Jan 04, 2015 4:41 pm
by _savage
I ran my tests inside of a disconnected screen instance, so no ssh at that point. However, I used "localhost" instead of the IP to address the host. And all tests always run on the same machine, clients never connect to the office instance over a physical network.

Re: Severe perf degradation on Linux

Posted: Sat Jan 10, 2015 3:54 pm
by _savage
After much discussion in different places, I was able to resolve this problem. The hint came from this forum answer. By using the following:

Code: Select all

uno:socket,host=localhost,port=2002,tcpNoDelay=1
I was able to run the test document in a mere 0.9s rather than 45s. It seems that the culprit was the default TCP packet buffering under Linux (Nagle's algorithm): small packets were being held back, and setting the TCP_NODELAY flag disables that buffering, thus speeding up the connection.
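
For readers unfamiliar with the flag: `TCP_NODELAY` is a standard socket option that turns off Nagle's algorithm, which may otherwise hold back small writes until earlier data has been acknowledged. The sketch below only demonstrates the option on a plain Python socket; it is not how the office sets it internally. The `tcpNoDelay=1` URL parameter asks the UNO bridge to do the equivalent on its own connection.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# 0 means Nagle's algorithm is active, which is the usual default.
before = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

# Disable Nagle's algorithm so that small writes are sent immediately.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
after = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

print("TCP_NODELAY before:", before, "after:", after)
sock.close()
```

Combined with the resolve strings used earlier in the thread, the full client URL would presumably read `uno:socket,host=localhost,port=2002,tcpNoDelay=1;urp;StarOffice.ComponentContext`.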

Re: [Solved] Severe perf degradation on Linux

Posted: Mon Jan 12, 2015 1:24 pm
by _savage