Read each word from DOCX
Posted: Tue Apr 07, 2020 4:27 pm
Hi All,
I am using Java UNO to load a DOCX document. Following the https://wiki.openoffice.org/wiki/API/Sa ... tStructure reference, I can read the paragraphs, but I cannot read each word within a paragraph.
Below is the sample code. I have also attached a sample DOCX template for reference.
Any help is appreciated. Thanks.
// Walk every top-level element of the document body (paragraphs and tables).
while (xParagraphEnumeration.hasMoreElements()) {
    XTextContent element = UnoRuntime.queryInterface(
            com.sun.star.text.XTextContent.class,
            xParagraphEnumeration.nextElement());
    XServiceInfo xInfo = UnoRuntime.queryInterface(
            XServiceInfo.class, element);
    if (xInfo.supportsService("com.sun.star.text.TextTable")) {
        // Tables are only reported here, not traversed.
        System.out.println(xInfo);
    } else {
        // A paragraph enumerates its text portions.
        XEnumerationAccess xParaEnumerationAccess = UnoRuntime.queryInterface(
                com.sun.star.container.XEnumerationAccess.class,
                element);
        XEnumeration xTextPortionEnum =
                xParaEnumerationAccess.createEnumeration();
        while (xTextPortionEnum.hasMoreElements()) {
            com.sun.star.text.XTextRange xTextRange = UnoRuntime.queryInterface(
                    com.sun.star.text.XTextRange.class,
                    xTextPortionEnum.nextElement());
            // This returns the whole portion, e.g. "Hello test ${name}".
            // I need each word separately: Hello, test, ${name}
            System.out.println(xTextRange.getString());
        }
    }
}
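To make the expected result concrete, here is a minimal sketch of the splitting step in plain Java, assuming that words are simply separated by whitespace. The class name `WordSplitter` and the method `splitWords` are hypothetical and not part of UNO; the input would be the string returned by `xTextRange.getString()`.

```java
import java.util.ArrayList;
import java.util.List;

public class WordSplitter {

    // Hypothetical helper: splits the text of one text portion into
    // whitespace-separated words. Assumes whitespace is the only separator.
    static List<String> splitWords(String portionText) {
        List<String> words = new ArrayList<>();
        if (portionText == null) {
            return words;
        }
        for (String token : portionText.trim().split("\\s+")) {
            // An empty or all-whitespace input yields one empty token; skip it.
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints Hello, test and ${name} on separate lines.
        for (String word : splitWords("Hello test ${name}")) {
            System.out.println(word);
        }
    }
}
```

Two caveats with this sketch: a text portion ends wherever the character formatting changes, so a word with mixed formatting may be spread over two portions, and punctuation stays attached to the neighbouring word. If word boundaries need to follow the document's own rules, the `com.sun.star.text.XWordCursor` interface may be a better fit, though that is not shown here.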