[Solved] RTF to HTML conversion loses centering attribute

Woody20
Posts: 4
Joined: Fri Jan 13, 2012 11:30 pm

Post by Woody20 »

I'm trying to use OO to convert RTF input (from MS Word 2000) to HTML.
If I open the RTF file, everything looks exactly correct on the screen.
If I save it as HTML, then re-open the HTML file in OO, almost everything looks the same (exception: table).
However, if I open the HTML file in Firefox, the text is not correct. Specifically, the paragraphs that were centered or right-justified in the RTF and when viewed in OO HTML are now all left-justified.

This is strange, because the text of the HTML file is

Code:

<P CLASS="western" ALIGN=CENTER STYLE="text-indent: 0in; margin-bottom: 0in">
<FONT COLOR="#000000"><FONT FACE="Verdana, sans-serif"><FONT SIZE=4 STYLE="font-size: 16pt"><B>Some text that should be centered</B></FONT></FONT></FONT></P>
and the class "western" is

Code:

P.western { font-size: 10pt; so-language: en-US }
Anybody know why the centering is not working as expected? I will deal with the table problems another day.
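
To see which elements in the exported file rely on the ALIGN attribute rather than on CSS, the HTML can be scanned with a short script. This is only a diagnostic sketch using Python's standard `html.parser` module; the names `AlignFinder` and `find_align_attributes` are made up for illustration. It does not explain why a browser ignores the attribute, it only lists where the attribute occurs.

```python
from html.parser import HTMLParser


class AlignFinder(HTMLParser):
    """Collect every start tag that carries an ALIGN attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        attr_map = dict(attrs)
        if "align" in attr_map:
            self.found.append((tag, attr_map["align"], attr_map.get("style")))


def find_align_attributes(html):
    """Return a list of (tag, align value, style value or None)."""
    finder = AlignFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    sample = ('<P CLASS="western" ALIGN=CENTER '
              'STYLE="text-indent: 0in; margin-bottom: 0in">Some text</P>')
    for tag, align, style in find_align_attributes(sample):
        print(tag, align, style)
```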
Last edited by Woody20 on Sun Jan 15, 2012 10:20 pm, edited 2 times in total.
OpenOffice 3.3 on Windows XP
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by Villeroy »

Running MS Windows, you have an application called "WordPad" which supports RTF natively.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
rudolfo
Volunteer
Posts: 1488
Joined: Wed Mar 19, 2008 11:34 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by rudolfo »

When exporting to HTML, OpenOffice offers several flavors: HTML 3.2, Netscape Navigator, Internet Explorer and Writer. Sometimes it helps to try the different flavors when exporting. This dialog sucks, though: the only clearly defined option is HTML 3.2, the old W3C standard. The other three can mean anything. IE4 is completely different from IE8, and the dialog doesn't say which version of IE it has in mind. But that's just another instance of the overall rule: don't use OOo for RTF, and don't use it for HTML.

Still, don't give up on OOo. The XHTML export is really good, at least if you view the result in Firefox. I'm not sure whether this holds for a document that was imported from RTF, but for native .odt documents the result is amazingly close to the original.
OpenOffice 3.1.1 (2.4.3 until October 2009) and LibreOffice 3.3.2 on Windows 2000, AOO 3.4.1 on Windows 7
There are several macro languages in OOo, but none of them is called Visual Basic or VB(A)! Please call it OOo Basic, Star Basic or simply Basic.
Woody20
Posts: 4
Joined: Fri Jan 13, 2012 11:30 pm

Re: RTF to HTML conversion loses centering attribute

Post by Woody20 »

How do I choose which HTML or XHTML standard I want? When I did "save as", there was only a single HTML choice.
OpenOffice 3.3 on Windows XP
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by Villeroy »

Try menu:File>Export
Exportable file formats are write-only. OOo can convert to these file formats but it cannot open them.
Woody20
Posts: 4
Joined: Fri Jan 13, 2012 11:30 pm

Re: RTF to HTML conversion loses centering attribute

Post by Woody20 »

File/Export gives me two choices: xhtml and PDF. If I choose xhtml I get a message that the selected Java runtime environment is defective.

Not sure what you mean by "write-only". If the file format is supported by OO, it should be readable by OO. However, this is not important for the current question, I'm just curious.
OpenOffice 3.3 on Windows XP
Villeroy
Volunteer
Posts: 31269
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by Villeroy »

I did not know that this tool requires Java.
Have a look at Tools>Options>Java and see if you can choose an auto-detected installation of Java or if you can point the office to an existing installation of Java.
A file format becomes write-only when it is possible to map your own attributes to the other file format's attributes, while the other way round would be very difficult, incomplete or impossible.
In the case of PDF it is obvious that the exported file has nothing to do with an office document.
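
Before opening Tools>Options>Java, it can help to know whether any Java runtime is installed at all. The following is a rough sketch in Python that only looks for a `java` executable on the PATH; the function name `find_java` is made up for illustration. It is an assumption that this tells you anything useful about the office: the office does its own detection, and a runtime found here is not guaranteed to be one the office accepts.

```python
import shutil
import subprocess


def find_java():
    """Return (path, version banner) for the first `java` on PATH, or None."""
    path = shutil.which("java")
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        # The executable exists but could not be run; report the path only.
        return path, ""
    # `java -version` writes its banner to stderr, not stdout.
    banner = (result.stderr or result.stdout).strip()
    return path, banner


if __name__ == "__main__":
    print(find_java() or "no java executable found on PATH")
```

If a path is printed, its installation directory is a candidate to add by hand in the Java options page.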
rudolfo
Volunteer
Posts: 1488
Joined: Wed Mar 19, 2008 11:34 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by rudolfo »

As background information, it is worth knowing that a great part of the export functionality in OOo is implemented through XSLT (Extensible Stylesheet Language Transformations). Quite a lot of transformation engines are available: Microsoft has one, the relevant scripting languages (Perl, PHP, Python) have modules for it, and one of the oldest and most stable processors is Xalan from the Apache project. OOo uses the Java implementation of Xalan, which is why a working Java runtime is required to export to XHTML.
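
To give an idea of what such a transformation does, here is a toy sketch in Python. It is not XSLT and not OOo's actual filter: it is plain Python with `xml.etree.ElementTree` that mimics the kind of rule an XSLT template expresses, namely "for every source element of this kind, emit that target element". The input vocabulary (`doc`, `para`, `align`) and the function name `to_xhtml` are hypothetical and far simpler than the real .odt XML.

```python
import xml.etree.ElementTree as ET


def to_xhtml(source_xml):
    """Map a made-up <doc><para align="..."> format to XHTML-like <p> elements."""
    src = ET.fromstring(source_xml)
    body = ET.Element("body")
    for para in src.iter("para"):
        p = ET.SubElement(body, "p")
        align = para.get("align")
        if align:
            # The alignment becomes a CSS property instead of an attribute.
            p.set("style", "text-align:%s" % align)
        p.text = para.text
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    print(to_xhtml('<doc><para align="center">Title</para><para>Body</para></doc>'))
```

Going in the opposite direction needs a second set of rules, and those rules are only easy to write when every attribute of the target format has a counterpart in the source format.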
In the usual export dialog PDF and XHTML seem to be on the same level, but PDF runs internally while XHTML requires the JRE. If you want to know in advance which export options need Java, there is a way to find out: in Tools -> XML Filter Settings you will see the currently installed and available filters. In my case:
DocBook file: OpenOffice.org Writer (.sxw), import/export filter
MS Word 2003: OpenOffice.org Writer (.odt), import/export filter
UOF text: OpenOffice.org Writer (.odt), import/export filter
XHTML Writer File: OpenOffice.org Writer (.odt), export filter

Everything you find in this Filter Settings dialog requires the Java runtime. If it is an import and export filter, you will find it under "Save As"; if it is only an export filter, File -> Export is the place to look for it. All four listed formats are XML based, which makes it easy to simply run another XSL transformation in the opposite direction to get the original .odt file back. How "original" the result is depends on how well the attributes of the two formats can be mapped onto each other. For XHTML, someone decided that this mapping is too limited to offer it as an import as well. Or it was just too complicated or time-consuming to write an appropriate import XSL file.

I don't know about the other formats above, but DocBook is used as a meta format in the open source community to generate HTML, TeX/PDF and of course also plain text formats. It is a logical markup language only. You can't view it directly (except as XML source code). If you want to visualize it, you need one of the transformation engines just mentioned.
rudolfo
Volunteer
Posts: 1488
Joined: Wed Mar 19, 2008 11:34 am
Location: Germany

Re: RTF to HTML conversion loses centering attribute

Post by rudolfo »

I think you still need the answer to where the different HTML flavors are set. It is in Tools -> Options -> Load/Save -> HTML compatibility.
But note that this setting has nothing to do with the XML filters and XHTML mentioned above. It is only relevant for the internal (and somewhat ancient) conversion to HTML. But I am pretty sure you will have more luck with XHTML.
Woody20
Posts: 4
Joined: Fri Jan 13, 2012 11:30 pm

Re: [Solved] RTF to HTML conversion loses centering attribute

Post by Woody20 »

I downloaded Java runtime v6 from Sun, used the Tools/Options/OpenOffice.org/Java window to add the jre6 directory created by the Java installer, and now I can open an RTF file with OO and export it as xhtml. The resulting file looks correct in Firefox, including the table.
marcpolizzi
Posts: 2
Joined: Mon Aug 21, 2017 5:17 pm

Re: RTF to HTML conversion loses centering attribute

Post by marcpolizzi »

rudolfo wrote: I think you still need the answer where to set the different HTML flavors. It is in Tools -> Options -> Load/Save - HTML compatibility.
It is only relevant for the internal (and somehow ancient) conversion to HTML.
:D YES, that's very good information. I had "Netscape", I switched to "HTML 3.2" and everything is right :)
The pictures are properly centered in all browsers.
Thanks
Marc
Apache Open Office 4.1.2
marcpolizzi
Posts: 2
Joined: Mon Aug 21, 2017 5:17 pm

Re: [Solved] RTF to HTML conversion loses centering attribute

Post by marcpolizzi »

Hi,

The problem when I set HTML 3.2 in AOO Writer to get good centering is that the page breaks are missing.
(and for making an ebook that's a problem)

So I went back to the Netscape, IE or Writer HTML export, and
in the HTML code I search & replace:
replace all ALIGN=CENTER> by style="text-align:center;">
and
replace all ALIGN=CENTER STYLE=" by style="text-align:center;
so all the centering is OK :D
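
For more than a handful of files, the two replacements above could be scripted. A minimal sketch in Python, assuming the export writes the attribute exactly as ALIGN=CENTER in upper case; the function name `align_to_css` is made up, and handling ALIGN=RIGHT the same way is an added assumption that is not part of the recipe above.

```python
import re

# ALIGN=CENTER (or RIGHT) directly followed by an existing STYLE attribute:
# fold the alignment into that style attribute.
_ALIGN_WITH_STYLE = re.compile(r'ALIGN=(CENTER|RIGHT) STYLE="')
# ALIGN=CENTER (or RIGHT) as the last attribute of the tag.
_ALIGN_ONLY = re.compile(r'ALIGN=(CENTER|RIGHT)>')


def align_to_css(html):
    """Replace the ALIGN attribute with an equivalent CSS text-align rule."""
    html = _ALIGN_WITH_STYLE.sub(
        lambda m: 'style="text-align:%s;' % m.group(1).lower(), html
    )
    html = _ALIGN_ONLY.sub(
        lambda m: 'style="text-align:%s;">' % m.group(1).lower(), html
    )
    return html


if __name__ == "__main__":
    sample = ('<P CLASS="western" ALIGN=CENTER '
              'STYLE="text-indent: 0in; margin-bottom: 0in">')
    print(align_to_css(sample))
```

Like the manual search & replace, this works on the raw text and not on parsed HTML, so it only catches the two attribute layouts named above. Keep a copy of the exported file before running it.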
Marc