[Solved] Unable to export a document to html

Amunike
Posts: 8
Joined: Tue Apr 28, 2009 9:46 pm

[Solved] Unable to export a document to html

Post by Amunike »

Hello Guys,

I'm trying to programmatically export a Word document to HTML using OpenOffice 3, but I haven't had any luck so far. My computer's operating system is Windows XP.
I have tried the following filters, all with disastrous results: 'HTML', 'HTML (StarWriter)' and 'impress_html_Export'.

To give you an idea of the type of result I have been getting, here is a sample of the output file:

PK�����”šœ:^Æ2f'���'������mimetypeapplication/vnd.oasis.opendocument.textPK�����”šœ:���������������Configurations2/statusbar/PK�����”šœ:���������������Configurations2/floater/PK�����”šœ:���������������Configurations2/popupmenu/PK�����”šœ:���������������Configurations2/progressbar/PK�����”šœ:���������������Configurations2/menubar/PK�����”šœ:���������������Configurations2/toolbar/PK�����”šœ:���������������Configurations2/images/Bitmaps/PK���”šœ:������������-���Pictures/200000B7000089FA0000AB416FBC0CF2.wmf|œuÜVÅó°wföÜtKww§ H‰€ˆ„ (ŠÒÝÝÝÝÒ-Ý-‚ÒRJ‡(ïuß‚ïï?_ŸÏå9gsvvfvöÍ9+X†¹ÑÿS;‡ á·.óár‘ÈÓK¬È3ôú[E#}D_ñO¸$—$qQyþ¥áòþ —çç¢ð|Á˜K6ü·T\.ž


My actual code is as follows:

Code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using unoidl.com.sun.star.lang;
using unoidl.com.sun.star.uno;
using unoidl.com.sun.star.bridge;
using unoidl.com.sun.star.frame;
using Microsoft.Win32;
using System.Runtime.InteropServices;
using System.IO;
using System.Diagnostics;

namespace WebApplicationOpenOff
{
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            InitOpenOfficeEnvironment();
        }

        protected void Button1_Click(object sender, EventArgs e)
        {

            if (StartOpenOffice())
            {
                //Get a ComponentContext
                unoidl.com.sun.star.uno.XComponentContext xLocalContext =
                   uno.util.Bootstrap.bootstrap();
                //Get MultiServiceFactory
                unoidl.com.sun.star.lang.XMultiServiceFactory xRemoteFactory =
                   (unoidl.com.sun.star.lang.XMultiServiceFactory)
                   xLocalContext.getServiceManager();
                //Get a ComponentLoader
                XComponentLoader aLoader =
                   (XComponentLoader)xRemoteFactory.createInstance("com.sun.star.frame.Desktop");
                //Load the sourcefile
                XComponent xComponent = initDocument(aLoader,
                   PathConverter("C:\\Documents and Settings\\Soumah\\Desktop\\Test-Word-Doc.doc"), "_blank");
                //Wait for loading
                while (xComponent == null)
                {
                    System.Threading.Thread.Sleep(1000);
                }
                saveDocument(xComponent, PathConverter("C:\\Documents and Settings\\Soumah\\Desktop\\Test-Word-Doc.html"));
                //Report completion
                Console.WriteLine("Conversion completed!");
            }

        }


        private void InitOpenOfficeEnvironment()
        {
            string baseKey;
            // OpenOffice being a 32 bit app, its registry location is different in a 64 bit OS  
            if (Marshal.SizeOf(typeof(IntPtr)) == 8)
                baseKey = @"SOFTWARE\Wow6432Node\OpenOffice.org\";
            else
                baseKey = @"SOFTWARE\OpenOffice.org\";

            // Get the URE directory  
            string key = baseKey + @"Layers\URE\1";
            RegistryKey reg = Registry.CurrentUser.OpenSubKey(key);
            if (reg == null) reg = Registry.LocalMachine.OpenSubKey(key);
            string urePath = reg.GetValue("UREINSTALLLOCATION") as string;
            reg.Close();
            urePath = Path.Combine(urePath, "bin");

            // Get the UNO Path  
            key = baseKey + @"UNO\InstallPath";
            reg = Registry.CurrentUser.OpenSubKey(key);
            if (reg == null) reg = Registry.LocalMachine.OpenSubKey(key);
            string unoPath = reg.GetValue(null) as string;
            reg.Close();

            string path;
            path = string.Format("{0};{1}", System.Environment.GetEnvironmentVariable("PATH"), urePath);
            System.Environment.SetEnvironmentVariable("PATH", path);
            System.Environment.SetEnvironmentVariable("UNO_PATH", unoPath);
        }

        /// <summary>
        /// Load a given file or create a new blank file
        /// </summary>
        /// <param name="aLoader">A ComponentLoader</param>
        /// <param name="file">The file</param>
        /// <param name="target">The target</param>
        /// <returns>The component</returns>
        static XComponent initDocument(
           XComponentLoader aLoader, string file, string target
           )
        {
            XComponent xComponent = aLoader.loadComponentFromURL(
               file, target, 0,
               new unoidl.com.sun.star.beans.PropertyValue[0]);

            return xComponent;
        }

        /// <summary>
        /// Saves the document.
        /// </summary>
        /// <param name="xComponent">The x component.</param>
        /// <param name="fileName">Name of the file.</param>
        static void saveDocument(XComponent xComponent, string fileName)
        {
            unoidl.com.sun.star.beans.PropertyValue[] propertyValue =
               new unoidl.com.sun.star.beans.PropertyValue[1];

            propertyValue[0] = new unoidl.com.sun.star.beans.PropertyValue();
            propertyValue[0].Name = "Filter";
            propertyValue[0].Value = new uno.Any("writer_html_Export");

            ((XStorable)xComponent).storeToURL(fileName, propertyValue);
        }

        /// <summary>
        /// Convert into OO file format
        /// </summary>
        /// <param name="file">The file.</param>
        /// <returns>The converted file</returns>
        private static string PathConverter(string file)
        {
            try
            {
                file = file.Replace(@"\", "/");

                return "file:///" + file;
            }
            catch (System.Exception ex)
            {
                throw; // rethrow without resetting the stack trace
            }
        }

        

        /// <summary>
        /// Starts the open office.
        /// </summary>
        /// <returns></returns>
        private static bool StartOpenOffice()
        {
            // GetProcessesByName expects the process name without the ".exe" extension
            Process[] ps = Process.GetProcessesByName("soffice");
            if (ps != null)
            {
                if (ps.Length > 0)
                    return true;
                else
                {
                    Process p = Process.Start("soffice.exe");
                    //give it some time to start
                    System.Threading.Thread.Sleep(3000);
                }
            }
            return true;
        }




    }
}

Please help me spot the problem.

Thank you in advance.


-Amunike
Last edited by Hagar Delest on Wed Apr 29, 2009 5:07 pm, edited 3 times in total.
Reason: tagged [Solved].
OOo 3.0.X on Ms Windows XP
Villeroy
Volunteer
Posts: 31363
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: unable to export a document to html

Post by Villeroy »

Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Amunike
Posts: 8
Joined: Tue Apr 28, 2009 9:46 pm

Re: unable to export a document to html

Post by Amunike »

I forgot to add a detail:

The file "TypeDetection.xml" is nowhere to be found in my installation. I've read on the web that I'm supposed to have it but it wasn't included in my brand new installation of OpenOffice 3.
Villeroy
Volunteer
Posts: 31363
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Unable to export a document to html

Post by Villeroy »

Another converter in Java: http://www.artofsolving.com/opensource/jodconverter

Moderation: moved from "General" to "External Programs"
Amunike
Posts: 8
Joined: Tue Apr 28, 2009 9:46 pm

Re: Unable to export a document to html

Post by Amunike »

Thanks for the tips, Villeroy. However, I am not trying to buy a solution; I am trying to implement one.
Villeroy
Volunteer
Posts: 31363
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Unable to export a document to html

Post by Villeroy »

Somebody has done this job already. No need to buy anything. Just download and use it.
http://sourceforge.net/project/showfile ... p_id=91849

http://sourceforge.net/project/shownote ... _id=675054
Licenses
========

JODConverter is distributed under the terms of the LGPL.

This basically means that you are free to use it in both open source
and commercial projects.

If you modify the library itself you are required to contribute
your changes back, so JODConverter can be improved.

(You are free to modify the sample webapp as a starting point for your
own webapp without restrictions.)

JODConverter includes various third-party libraries so you must
agree to their respective licenses - included in docs/third-party-licenses.

That may include software developed by

* the Apache Software Foundation (http://www.apache.org)
* the Spring Framework project (http://www.springframework.org)
Amunike
Posts: 8
Joined: Tue Apr 28, 2009 9:46 pm

Re: Unable to export a document to html

Post by Amunike »

Thanks for all your replies, Villeroy. I'm really looking to fix my code... a third-party solution is not in line with my project requirements. I would really appreciate it if you had any idea of how I can fix my code.

-Amunike
Villeroy
Volunteer
Posts: 31363
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Unable to export a document to html

Post by Villeroy »

Debug? Read other people's code? The current output looks like a typical Writer document, which is a zip archive containing XML mainly.
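
A quick way to confirm that observation from code: a zip archive starts with the bytes 'P', 'K', 0x03, 0x04, which matches the "PK" at the start of the sample output. The following is only a minimal sketch; the class and method names (OutputCheck, LooksLikeZip) are hypothetical and not part of the original code.

```csharp
using System.IO;

static class OutputCheck
{
    // ZIP local file header signature: 'P', 'K', 0x03, 0x04.
    static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns true when the file begins with the ZIP signature, which
    // suggests the export wrote a zip-based package instead of HTML text.
    public static bool LooksLikeZip(string path)
    {
        byte[] head = new byte[ZipSignature.Length];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (total < head.Length)
            {
                int n = fs.Read(head, total, head.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        if (total < head.Length) return false;
        for (int i = 0; i < head.Length; i++)
        {
            if (head[i] != ZipSignature[i]) return false;
        }
        return true;
    }
}
```

If LooksLikeZip returns true for the exported ".html" file, the export filter was most likely not applied and the document was stored in its default format.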

Whatever language that is, are you sure that ...

Code:

new unoidl.com.sun.star.beans.PropertyValue[1];
... initializes an array of exactly one property value?

Do your project requirements really rely on a 300MB office suite to convert doc2html? "doc2html" throws thousands of google results unrelated to OOo.
mnasato
Posts: 4
Joined: Tue Apr 28, 2009 3:29 pm
Location: London, U.K.

Re: Unable to export a document to html

Post by mnasato »

Amunike wrote:

Code:

            propertyValue[0].Name = "Filter";
That should be "FilterName".
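
For illustration, here is a minimal sketch of the save step with the property renamed. It assumes the same CLI UNO assemblies the original code references; StoreWithFilter and ExportSketch are hypothetical names, and the filter name is passed in because which name suits a given document type and installation still needs to be verified.

```csharp
using unoidl.com.sun.star.beans;
using unoidl.com.sun.star.frame;
using unoidl.com.sun.star.lang;

static class ExportSketch
{
    // Hypothetical helper, not part of the original code.
    // targetUrl must be a file URL such as the one PathConverter builds.
    public static void StoreWithFilter(XComponent doc, string targetUrl, string filterName)
    {
        PropertyValue[] args = new PropertyValue[1];
        args[0] = new PropertyValue();
        // "FilterName" is the property that selects the export filter;
        // the original code used "Filter" here.
        args[0].Name = "FilterName";
        args[0].Value = new uno.Any(filterName);

        ((XStorable)doc).storeToURL(targetUrl, args);
    }
}
```

A call could look like StoreWithFilter(xComponent, url, "HTML (StarWriter)"), where "HTML (StarWriter)" is one of the names listed in the first post and is only assumed to be the right one for a Writer document.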
OOo 3.0.X on Ubuntu 8.x
Amunike
Posts: 8
Joined: Tue Apr 28, 2009 9:46 pm

Re: Unable to export a document to html

Post by Amunike »

Hi guys,

Thanks for your help. I managed to solve my problem with this:

http://user.services.openoffice.org/en/ ... 20&t=12371

I was basically using the wrong filter name.

Regards,
Amunike.