[Solved] Query Writer for supported file types

_savage
Posts: 187
Joined: Sun Apr 21, 2013 12:55 am

Post by _savage »

Is there a way to ask Writer (or any of the other applications) for a list of file types that they are able to load? Ideally a list of mime types?
Last edited by _savage on Thu Mar 31, 2016 9:40 am, edited 2 times in total.
Mac 10.14 using LO 7.2.0.2, Gentoo Linux using LO 7.2.3.2 headless.
karolus
Volunteer
Posts: 1158
Joined: Sat Jul 02, 2011 9:47 am

Re: Query Writer for supported file types

Post by karolus »

AOO4, Libreoffice 6.1 on Rasbian OS (on ARM)
Libreoffice 7.4 on Debian 12 (Bookworm) (on RaspberryPI4)
Libreoffice 7.6 flatpak on Debian 12 (Bookworm) (on RaspberryPI4)
_savage
Posts: 187
Joined: Sun Apr 21, 2013 12:55 am

Re: Query Writer for supported file types

Post by _savage »

Great, exactly what I was looking for! I wrote a Python function which returns a dictionary that maps each mime type to a set of supported file extensions.

Code: Select all

import os
import lxml.etree

def get_writer_types():
    # Return the first element of a list, or None if the list is empty.
    def _first(l):
        return l[0] if l else None

    # Helper function to retrieve the property value for a given property.
    # It uses the `xml` root parsed below, so only call it after the file is loaded.
    def _prop(e, pstr):
        plist = e.xpath("prop[@oor:name='" + pstr + "']/value/text()", namespaces=xml.nsmap)
        return _first(plist)

    registry_path = "/Applications/LibreOffice.app/Contents/Resources/registry"
    # linux: "/opt/libreoffice5.0/share/registry/"

    # Load Writer's registry file, and build an lxml tree. Binary mode lets
    # lxml pick up the encoding declared in the XML instead of the locale default.
    with open(os.path.join(registry_path, "writer.xcd"), "rb") as regf:
        xml = lxml.etree.parse(regf).getroot()

    # Populate this dictionary with the supported mime types.
    types = dict()

    # Walk the 'TypeDetection' subtree to find the supported types.
    xpath_e = "//oor:component-data[@oor:package='org.openoffice.TypeDetection']/node[@oor:name='Types']/node"
    for e in xml.xpath(xpath_e, namespaces=xml.nsmap):
        # Fall back to placeholders when a type has no extensions or no mime type.
        ext = _prop(e, "Extensions") or "dummy"
        mediatype = _prop(e, "MediaType") or "application/x-none"
        if mediatype not in types:
            types[mediatype] = set()
        types[mediatype].update(ext.split(" "))

    # Return the dictionary of supported types.
    return types
It returns, for example, a dictionary mapping mime types to sets of file extensions, like so:

Code: Select all

{'text/plain': {'tsv', 'tab', 'txt', 'csv'}, 'application/msword': {'docx', 'doc', 'dot', 'dotm', 'dotx', 'docm'}, ...}
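Given a dictionary of that shape, a reverse lookup from file extension to mime type can be handy, for instance to check whether a file is worth handing to Writer at all. This is only a minimal sketch; `extension_to_mime` is a hypothetical helper name, not part of the function above or of any LibreOffice API:

```python
def extension_to_mime(types):
    # Invert {mime type: {extensions}} into {extension: {mime types}}.
    # An extension may be listed under more than one mime type, so each
    # value is a set rather than a single string.
    lookup = {}
    for mediatype, extensions in types.items():
        for ext in extensions:
            lookup.setdefault(ext, set()).add(mediatype)
    return lookup


# Sample data copied from the output shown above.
sample = {
    "text/plain": {"tsv", "tab", "txt", "csv"},
    "application/msword": {"docx", "doc", "dot", "dotm", "dotx", "docm"},
}
by_ext = extension_to_mime(sample)
print(by_ext["docx"])
```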
Note, however, that the dictionary also contains 'application/pdf', which fails to actually load; its registry entry has no 'DetectService' or 'PreferredFilter' property. The original registry file also has entries without a mime type (I used 'application/x-none' in that case) and entries without a file extension (those end up with the 'dummy' extension).
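To keep those placeholder entries out of the result, one option is to filter the dictionary afterwards. This is a minimal sketch that assumes the 'application/x-none' and 'dummy' fallbacks used in the function above; `drop_placeholders` and its parameter names are hypothetical:

```python
def drop_placeholders(types, skip_mime=("application/x-none",), skip_ext=("dummy",)):
    # Return a filtered copy of `types`; the input dictionary is not modified.
    cleaned = {}
    for mediatype, extensions in types.items():
        if mediatype in skip_mime:
            continue
        # Discard placeholder extensions and any empty strings left by split(" ").
        real = {ext for ext in extensions if ext and ext not in skip_ext}
        # Design choice of this sketch: a mime type with no real extension
        # left is dropped entirely.
        if real:
            cleaned[mediatype] = real
    return cleaned


# Made-up sample data in the same shape as the dictionary above.
sample = {
    "text/plain": {"txt", "csv"},
    "application/x-none": {"xyz"},
    "application/msword": {"doc", "dummy"},
    "application/x-example": {"dummy"},
}
print(drop_placeholders(sample))
```

Passing `skip_mime=("application/x-none", "application/pdf")` would also leave out the PDF entry mentioned above.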