I need to extract the property values of an OO Writer document generically so that I can export them to an XML file. I don't think that calling the Java methods of http://www.openoffice.org/api/docs/comm ... erties.htm one by one is a good approach. The obstacle is that each of these properties has a different type. For example, "Author" is a string, while DocumentStatistics is a sequence of com::sun::star::beans::NamedValue.
I have the code below, which extracts only the names of the document properties. I need equivalent code that returns each value as a String so that I can export it to my XML file. Can you propose a way to iterate over the document properties and get their values?
What about XDocumentProperties.storeToStorage() or storeToMedium(),
and then specifying an XML file for the XStorage? Would this do it? Can you provide some Java code that exports the Writer document properties to an XML file?
Is there some exportProperties() method equivalent to this importer: http://www.openoffice.org/api/docs/comm ... orter.html
Thanks
OOo 3.4 and LibreOffice 3.4 on openSuse 11.4 + windows 7
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import com.sun.star.lang.IllegalArgumentException;
import com.sun.star.uno.AnyConverter;
import com.sun.star.util.Date;
import com.sun.star.util.DateTime;

// Converts a document property value to a String.
// Returns null for types that are not handled here (for example NamedValue sequences).
public String propertyValueToString(Object propValue) throws IllegalArgumentException {
    String res = null;
    SimpleDateFormat sdf = new SimpleDateFormat("MMM dd,yyyy HH:mm");
    if (AnyConverter.isDouble(propValue)) {
        res = Double.toString(AnyConverter.toDouble(propValue));
    } else if (AnyConverter.isFloat(propValue)) {
        res = Float.toString(AnyConverter.toFloat(propValue));
    } else if (AnyConverter.isInt(propValue)) {
        res = Integer.toString(AnyConverter.toInt(propValue));
    } else if (AnyConverter.isLong(propValue)) {
        res = Long.toString(AnyConverter.toLong(propValue));
    } else if (AnyConverter.isShort(propValue)) {
        res = Short.toString(AnyConverter.toShort(propValue));
    } else if (AnyConverter.isBoolean(propValue)) {
        res = Boolean.toString(AnyConverter.toBoolean(propValue));
    } else if (AnyConverter.isString(propValue)) {
        res = AnyConverter.toString(propValue);
    } else if (propValue instanceof Date) {
        Date dt = (Date) propValue;
        // UNO months run from 1 to 12, GregorianCalendar months from 0 to 11
        Calendar cal = new GregorianCalendar(dt.Year, dt.Month - 1, dt.Day);
        res = sdf.format(cal.getTime());
    } else if (propValue instanceof DateTime) {
        DateTime dt = (DateTime) propValue;
        // Keep the time of day as well, since the format prints HH:mm
        Calendar cal = new GregorianCalendar(dt.Year, dt.Month - 1, dt.Day,
                dt.Hours, dt.Minutes, dt.Seconds);
        res = sdf.format(cal.getTime());
    }
    return res;
}
What about extracting the meta.xml file from the .odt file directly? I've done this sometimes with Python and the zipfile module. I am pretty sure there is something similar in Java. And if not you can use the "com.sun.star.packages.Package" UNO-service.
The drawback might be that this method works on files (on the disk) and not on data blocks in memory. And you will probably have to run the meta.xml through an XSLT process to convert it to something that fits your needs.
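For the Java side, a minimal sketch of the same idea could use java.util.zip from the standard library, since an .odt file is an ordinary ZIP package. The class name OdtMetaExtractor and the method readMetaXml are made up for this illustration, and the sketch assumes the document has been saved to disk first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of any OpenOffice API.
final class OdtMetaExtractor {

    /** Returns the raw bytes of the meta.xml entry inside the .odt package at odtPath. */
    static byte[] readMetaXml(String odtPath) throws IOException {
        try (ZipFile odt = new ZipFile(odtPath)) {
            ZipEntry entry = odt.getEntry("meta.xml");
            if (entry == null) {
                throw new IOException("No meta.xml entry in " + odtPath);
            }
            try (InputStream in = odt.getInputStream(entry);
                 ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                // Copy the entry into memory in small chunks
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            }
        }
    }
}
```

Calling OdtMetaExtractor.readMetaXml with the path of a saved document returns the XML as bytes, which can then go to a DOM parser or an XSLT transformer.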
OpenOffice 3.1.1 (2.4.3 until October 2009) and LibreOffice 3.3.2 on Windows 2000, AOO 3.4.1 on Windows 7
There are several macro languages in OOo, but none of them is called Visual Basic or VB(A)! Please call it OOo Basic, Star Basic or simply Basic.
Hmm. I have written a Java OO utility to introspect the oxt binary (see my blog post: https://othmanelmoulatblog.wordpress.co ... with-java/),
but this Java class only parses and extracts data from the binary oxt, not from the Writer document currently open in the OpenOffice application. So I need another class that extracts data from the .odt, not from the oxt. I also know that the ODF Toolkit supports this kind of work. If you can present a Java (or Python) snippet that extracts data from the current .odt, that would be very welcome.
Thanks
In the method arguments, fileURL is the full URL of where you want your XML file to be stored.
This is a good method, but it is not perfect for me, because my XML needs some additional properties appended to the document properties. So I have to parse the exported file again, append the elements to the XML, and then export it again, which is not very clean.
However, if you only want to export your document properties as XML, the XMLTools class above is a very good solution.
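The "parse, append, export again" step can at least be kept in memory with the standard DOM and Transformer APIs. This is only a sketch under assumptions: the class XmlAppender and the method appendElements are invented for this example, the exported XML is assumed to be available as bytes, and the extra properties are simply added as child elements of the root without a namespace.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not related to the XMLTools class mentioned in the post.
final class XmlAppender {

    /** Parses xml, appends one child element per entry of extras to the root, and serializes the result. */
    static String appendElements(byte[] xml, Map<String, String> extras) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        Element root = doc.getDocumentElement();
        for (Map.Entry<String, String> e : extras.entrySet()) {
            // Element name = map key, text content = map value (no namespace)
            Element extra = doc.createElementNS(null, e.getKey());
            extra.setTextContent(e.getValue());
            root.appendChild(extra);
        }
        // An identity transformation writes the modified DOM back out as text
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

The map keys must be valid XML element names. Whether the extra elements belong under the root or under a nested element depends on the target XML layout, which is not specified here.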
import zipfile
from xml.dom.minidom import parseString

odt_path = 'document.odt'  # placeholder: path to the saved Writer document

# An .odt file is a ZIP archive; meta.xml holds the document properties
odt = zipfile.ZipFile(odt_path, 'r')
metaData = odt.read('meta.xml')
odt.close()
# metaData is the XML as a stream of bytes
# If we want to inspect it a bit closer, we use Python's XML modules
xmlMeta = parseString(metaData)
metaRoot = xmlMeta.documentElement.getElementsByTagName('office:meta')[0]
# Loop through the child nodes of "office:meta"
# And place either the child element and its first text node or the node and its
# attributes into two dictionaries
counters = {}
metas = {}
for aNode in metaRoot.childNodes:
    if aNode.localName == 'document-statistic':
        # The statistics are stored as attributes (word-count, page-count, ...)
        for i in range(aNode.attributes.length):
            attr = aNode.attributes.item(i)
            counters[attr.localName] = attr.value
    elif aNode.hasChildNodes():
        metas[aNode.localName] = aNode.firstChild.nodeValue
That's plain Python with a few modules from its standard library. No OpenOffice UNO interfaces. Hence it might be different in Java, although I am pretty sure that the DOM parsing code is much the same: getChildNodes and getElementsByTagName should be available in any language, because they are DOM-specific rather than programming-language-specific.
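In Java the same loop could look roughly like the sketch below, using only the JDK's DOM parser. The class name MetaXmlParser and its two maps are made up for this illustration, and the meta.xml content is assumed to be already available as a byte array. One deliberate difference from the Python version: it uses getTextContent() instead of the first child's node value.

```java
import java.io.ByteArrayInputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any OpenOffice API.
final class MetaXmlParser {
    // Namespace that the "office:" prefix is bound to in ODF files
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";

    final Map<String, String> counters = new LinkedHashMap<>();
    final Map<String, String> metas = new LinkedHashMap<>();

    /** Fills counters from document-statistic attributes and metas from the other children of office:meta. */
    static MetaXmlParser parse(byte[] metaXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getLocalName() to work
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(metaXml));
        NodeList metaElements = doc.getElementsByTagNameNS(OFFICE_NS, "meta");
        if (metaElements.getLength() == 0) {
            throw new IllegalArgumentException("no office:meta element found");
        }
        MetaXmlParser result = new MetaXmlParser();
        NodeList children = metaElements.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments
            }
            if ("document-statistic".equals(child.getLocalName())) {
                NamedNodeMap attrs = child.getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Node attr = attrs.item(j);
                    result.counters.put(attr.getLocalName(), attr.getNodeValue());
                }
            } else if (child.hasChildNodes()) {
                result.metas.put(child.getLocalName(), child.getTextContent());
            }
        }
        return result;
    }
}
```

As in the Python version, elements that share a local name (several user-defined entries, for instance) overwrite each other in the map, so a real exporter would probably need a richer key.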
It is probably the same logic I used for the oxt Java class mentioned above. Not a clean solution, but worth a test.