Square boxes in generated PDF

Issues with installing under all GNU/Linux Distributions
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Square boxes in generated PDF

Post by Supriya »

Hi,
I am using OpenOffice 3.3 to convert DOCX to PDF on a Red Hat Linux server.
The DOCX file contains text in the Mangal font, but in the PDF that text is displayed as square boxes.
Please help me resolve this issue.

The DOCX and PDF files are attached.
Attachments
uploaded.pdf
Generated pdf file
(16.76 KiB) Downloaded 604 times
Doc_hindi.docx
Docx file
(13.68 KiB) Downloaded 395 times
Last edited by Supriya on Mon Jul 16, 2012 2:32 pm, edited 1 time in total.
RoryOF
Moderator
Posts: 34613
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: square boxes in generated PDF

Post by RoryOF »

Use the PDF/A option in the Export as PDF settings.

Note: there is no need to double post. I have deleted your other post on the same problem (RoryOF, Moderator).
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: square boxes in generated PDF

Post by Supriya »

RoryOF wrote:Use pdf/A option in Export as PDF settings

Note: there is no need to double post - I have deleted your other post on the same problem (RoryOF, Moderator)
.

I am still getting the same problem :(
Please see the attached documents as well.
open office 3.3 Linux
RoryOF
Moderator
Posts: 34613
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: square boxes in generated PDF

Post by RoryOF »

Works perfectly for me using your .docx file. Re-check that you have selected PDF/A-1a in the General tab of the Export as PDF options.

In any event, if you are working in OpenOffice, it is best to work in OpenOffice format (.odt) and stay away from .docx. If you must share editable files with others using MS Word, it is best to use .doc as the transfer format, but there may be some formatting problems anyway.
jrkrideau
Volunteer
Posts: 3816
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: square boxes in generated PDF

Post by jrkrideau »

Supriya wrote:
RoryOF wrote:Use pdf/A option in Export as PDF settings

Note: there is no need to double post - I have deleted your other post on the same problem (RoryOF, Moderator)
.

Still getting same problem :(
Please see the attached Docs as well
Seems to work fine for me in LibreOffice 3.5.3.2
LibreOffice 7.3.7.2; Ubuntu 22.04
Hagar Delest
Moderator
Posts: 32658
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: square boxes in generated PDF

Post by Hagar Delest »

Works fine for me on Ubuntu 12.04 with AOO 3.4.1. The font is replaced by another one but it's ok.
Are you sure that you have the font installed on the server? Does the file display correctly on your RH system?
LibreOffice 7.6.2.1 on Xubuntu 23.10 and 7.6.4.1 portable on Windows 10
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: square boxes in generated PDF

Post by Supriya »

Hello,
Yes, the file opened properly on the RH system.
I am using JODConverter to convert the DOCX file to PDF.
Below is the code I have implemented for OpenOffice.

Code: Select all

// Host and port on which the OpenOffice listener accepts connections
String strOOOHost = "localhost";
String strOOOPort = "8200";

// Start OpenOffice as a socket listener
Runtime rt = Runtime.getRuntime();
Process process = rt.exec("/opt/openoffice.org3/program/soffice -accept=socket,host=" + strOOOHost + ",port=" + strOOOPort + ";urp;");
wdComponentAPI.getMessageManager().reportSuccess("stage2");

OpenOfficeConnection openOfficeCOnnection = new SocketOpenOfficeConnection("localhost", 8200);
wdComponentAPI.getMessageManager().reportSuccess("stage3");

// Connect to the listener
try {
    openOfficeCOnnection.connect();
} catch (Exception e) {
    // TODO: handle exception
    e.printStackTrace();
}
wdComponentAPI.getMessageManager().reportSuccess("stage4");

// Get a handle to the document converter
DocumentConverter docConverter = new OpenOfficeDocumentConverter(openOfficeCOnnection);
DefaultDocumentFormatRegistry reg = new DefaultDocumentFormatRegistry();
wdComponentAPI.getMessageManager().reportSuccess("stage5");

// Convert the source Word file to the destination PDF file
docConverter.convert(resourceimg.getContent().getInputStream(),
        reg.getFormatByMimeType("application/vnd.openxmlformats-officedocument.wordprocessingml.document"),
        outStream,
        reg.getFormatByMimeType("application/pdf"));
RoryOF
Moderator
Posts: 34613
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: square boxes in generated PDF

Post by RoryOF »

If, instead of using JOD to do the conversion to PDF, you do it by hand in OpenOffice, does it work correctly then? If so, you need to pass suitable conversion parameters to JOD or set suitable default parameters.
Edit: There is some useful discussion, with links, at
http://user.services.openoffice.org/en/ ... 45&t=54801 
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

Hi all,
I am still facing the square boxes issue in PDF generation using OpenOffice 3.3 on a Red Hat Linux server.
We have the Mangal font installed on the server. It shows correctly in the font name drop-down in OpenOffice, and the fc-list command on Linux lists it among the fonts installed on the server.
When I print the PDF output stream to the console, I observe that the Mangal font has been replaced with DejaVuSans.
Please help me resolve this issue as soon as possible.
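One thing that may be worth ruling out: fc-list reports what fontconfig sees for the user who runs it, and the account that runs the application server may not be the one used at the shell. The following is only a sketch of how the same check could be run from inside the Java process; the class name `FcListCheck` and its helper methods are made up for illustration, and it assumes `fc-list` is on the PATH of that process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of JODConverter or OpenOffice.
public class FcListCheck {

    /** Returns the lines of fc-list output that mention the family (case-insensitive). */
    static List<String> linesMentioning(List<String> fcListOutput, String family) {
        String wanted = family.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String line : fcListOutput) {
            if (line.toLowerCase(Locale.ROOT).contains(wanted)) {
                hits.add(line);
            }
        }
        return hits;
    }

    /** Runs fc-list as the current user and collects its output line by line. */
    static List<String> runFcList() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("fc-list").redirectErrorStream(true).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        p.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String family = args.length > 0 ? args[0] : "Mangal";
        List<String> hits = linesMentioning(runFcList(), family);
        // Print the account name so it can be compared with the shell user.
        System.out.println(System.getProperty("user.name") + " sees " + hits.size()
                + " fontconfig entries for " + family);
        for (String hit : hits) {
            System.out.println("  " + hit);
        }
    }
}
```

If this prints zero entries when called from the deployed application but fc-list shows Mangal at the shell, the difference is probably in the environment of the server process rather than in the font file itself.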
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

Code: Select all

$ java -jar lib/jodconverter-cli-2.2.2.jar Doc_hindi.docx Doc_hindi.pdf
Adobe Reader Document Properties Fonts:

Arial-BoldMT
ArialMT
Lohit-Devanagari

Code: Select all

$ java -jar lib/jodconverter-core-3.0-beta-4.jar Doc_hindi.docx Doc_hindi.pdf
Adobe Reader Document Properties Fonts:

Arial-BoldMT
ArialMT
Code2000
DejaVuSans

Simple question: Do you have DejaVuSans installed ubiquitously?

[As far as I know I do not have the Mangal font installed anywhere in Fedora 17, which from the dollar signs you can see I was using.]

Something I've found strange, though I don't know if it is significant: if I copy and paste the boxes from the .pdf that the OP provided into Writer, they all get replaced with १ (Unicode U+0967), the Hindi number one I believe, which is the first character in the OP's original .docx document.
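To check which characters a pasted string really contains, something like the sketch below might help. It only uses standard `java.lang.Character` methods; the class name `CodePointDump` is made up for illustration.

```java
// Hypothetical helper for inspecting pasted text, one code point per line.
public class CodePointDump {

    /** Describes each code point as "U+XXXX SCRIPT NAME". */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(String.format("U+%04X %s %s",
                    cp, Character.UnicodeScript.of(cp), Character.getName(cp)));
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Default input is U+0967, the character mentioned above.
        String pasted = args.length > 0 ? args[0] : "\u0967";
        System.out.println(describe(pasted));
    }
}
```

If every box in the PDF maps to the same code point, that would suggest the text content of the PDF is damaged and not only its glyphs, but that is a guess and not something established in this thread.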
LibreOffice 7.x.x on Arch and Fedora.
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
That is what I am saying: all content in the Mangal font is getting replaced with DejaVuSans (१).
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

At what run level are you operating the server? If you are only using level 3 will you not be without access to .ttf fonts (which require run level 5 – X server)?
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
We are operating our server at run level 3.
However, we followed all the steps in the specified link when installing the font on the Red Hat Linux server.
I would also like to know whether this is a font issue or something else.
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

Switched to run level 3 on Fedora:

Code: Select all

# init 3
and ran:

Code: Select all

$ java -jar lib/jodconverter-core-3.0-beta-4.jar Doc_hindi.docx Doc_hindi.pdf
and it still converted perfectly.

Can you check your Java program/macro is finding all fonts? Something like this perhaps:

http://www.roseindia.net/java/java-get- ... font.shtml
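A minimal sketch of such a check, using only the standard `java.awt.GraphicsEnvironment` API, could look like this. The class name `JavaFontCheck` is made up, and note the caveat: this lists the fonts visible to the JVM, which is not necessarily the same set that the separate soffice process sees.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper: lists the font families visible to this JVM.
public class JavaFontCheck {

    /** Case-insensitive lookup of a family name in the list. */
    static boolean containsFamily(String[] families, String wanted) {
        for (String family : families) {
            if (family.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Without an X server (e.g. at run level 3) AWT has to run headless.
        System.setProperty("java.awt.headless", "true");
        String wanted = args.length > 0 ? args[0] : "Mangal";

        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        for (String family : families) {
            System.out.println(family);
        }
        System.out.println(wanted + " visible to Java: " + containsFamily(families, wanted));
    }
}
```

Running this inside the deployed application, not only from a login shell, would show whether the server process itself can see the font.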

I'm getting seriously out of my depth here, so if someone else can help, please do!

Are you sure about:

Code: Select all

docConverter.convert(resourceimg.getContent().getInputStream(),
Where did you get it from? The only uses of resourceimg I can find on the Web are in association with actual images. Maybe file.getContent().getInputStream()?

I presume the server locale is Unicode, for example:

Code: Select all

$ locale
LANG=en_GB.utf8
LC_CTYPE="en_GB.utf8"
LC_NUMERIC=en_GB.utf8
LC_TIME=en_GB.utf8
LC_COLLATE="en_GB.utf8"
LC_MONETARY=en_GB.utf8
LC_MESSAGES="en_GB.utf8"
LC_PAPER="en_GB.utf8"
LC_NAME="en_GB.utf8"
LC_ADDRESS="en_GB.utf8"
LC_TELEPHONE="en_GB.utf8"
LC_MEASUREMENT=en_GB.utf8
LC_IDENTIFICATION="en_GB.utf8"
LC_ALL=
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
When I run the same code on Windows, it runs perfectly and all the text comes out properly.
But when I deploy my application on the server, I get this problem.
So I really want to know whether some problem in the server configuration is preventing my application from accessing TTF fonts.
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

And the server locale is?

Since there is a box for each character in the correct position it seems to me the problem is most likely to be the font or the encoding or locale, but, of course, in such situations one has to think of everything and try everything.
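The locale printed at a shell prompt is not necessarily the one the application server process runs with, so it may be worth printing it from inside Java as well. This is only a sketch using standard JDK calls; the class name `LocaleCheck` and the helper `looksUtf8` are made up for illustration.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper: shows the locale and encoding seen by this JVM.
public class LocaleCheck {

    /** True if a POSIX locale name such as "en_US.UTF-8" or "en_GB.utf8" names UTF-8. */
    static boolean looksUtf8(String localeName) {
        if (localeName == null) {
            return false;
        }
        String name = localeName.toLowerCase(Locale.ROOT);
        return name.contains(".utf-8") || name.contains(".utf8");
    }

    public static void main(String[] args) {
        // Environment variables as inherited by this process, not by the login shell.
        for (String var : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            String value = System.getenv(var);
            System.out.println(var + "=" + value + " (UTF-8: " + looksUtf8(value) + ")");
        }
        System.out.println("file.encoding=" + System.getProperty("file.encoding"));
        System.out.println("default charset=" + Charset.defaultCharset());
        System.out.println("default locale=" + Locale.getDefault());
    }
}
```

A process started by an init script or an application server often inherits a much smaller environment than an interactive login, so `LANG` can be unset there even when `locale` looks fine at the prompt.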
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
Server locale:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

Reading through threads elsewhere, I've realised that you must be using JODConverter 2.* rather than JODConverter 3.*: with the latter you would not need to start OpenOffice yourself (JODConverter 3.* does that itself), and you would not be able to use streams:
The convert() method that accept streams in v2.2 just use temporary files internally. So there's no performance gain whatsoever in using streams directly.

In fact, I noticed that many people tried to use streams even when it didn't actually make sense to do so. And that's why I removed it from v3.0, to avoid that sort of confusion.
https://groups.google.com/forum/?fromgr ... %5B1-25%5D

The latest version of JODConverter with the latest version of LibreOffice/OpenOffice is very generally recommended.

Are you only having a problem with Hindi documents as reported here:

https://groups.google.com/forum/?fromgr ... %5B1-25%5D

(Seems to be a specifically CentOS problem.)
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
Let me explain the whole scenario.
We have two servers:
Server 1: Red Hat Linux 5.7, run level 5, mangal.ttf font installed
Server 2: Red Hat Linux 5.7, run level 3, mangal.ttf font installed

My application runs correctly on Server 1: all text in the Arial and Mangal fonts comes out properly in the generated PDF.
On Server 2, however, text in the Arial font comes out properly but text in the Mangal font is replaced with square boxes.
RoryOF
Moderator
Posts: 34613
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Square boxes in generated PDF

Post by RoryOF »

An obvious modification is to advance Server 2 to runlevel 5 and see if that cures the problem.
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi,
Do you really think advancing the server run level will solve my problem?
Is there any relation between the server run level and TTF fonts?
The server is not under my control, and I have to give the Basis person a strong reason before doing that.
RoryOF
Moderator
Posts: 34613
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Square boxes in generated PDF

Post by RoryOF »

Earlier in this thread, Robert Tucker said
Robert Tucker wrote:At what run level are you operating the server? If you are only using level 3 will you not be without access to .ttf fonts (which require run level 5 – X server)?
He knows more about Red Hat server than I do.
Supriya
Posts: 11
Joined: Mon Jul 16, 2012 11:52 am

Re: Square boxes in generated PDF

Post by Supriya »

hi Robert,
Please suggest an approach or provide a link so that I can go ahead with advancing the server run level.
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Square boxes in generated PDF

Post by Robert Tucker »

As I posted above, Fedora 17 did the conversion perfectly at run level 3 using jodconverter-core-3.0-beta-4 at command line level. (Also checked same with jodconverter-2.2.2 which again works perfectly at run level 3.)

This thread might be worth investigating:

http://user.services.openoffice.org/en/ ... 44&t=39112

Can you not incorporate some code to list all the fonts available to the Java installation, as suggested above, and run a macro in OpenOffice to find all the fonts available to it? There's an OpenOffice font listing macro (in Basic) here:

http://www.oooforum.org/forum/viewtopic.phtml?t=14900

which I think would need adapting to run headless and an extension here:

http://extensions.services.openoffice.o ... /TestFonts

Perhaps it would be test enough to open a Hindi .odt document and then save it with OpenOffice running headless.
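For reference, this is roughly how the OpenOffice listener could be launched headless from Java. It is a sketch under assumptions: the soffice path, host and port are taken from the code posted earlier in the thread, the single-dash options `-headless`, `-nologo` and `-norestore` are the OpenOffice.org 3.x spellings (newer LibreOffice versions use double dashes), and the class name `HeadlessLauncher` is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the command line for a headless soffice listener.
public class HeadlessLauncher {

    /** Builds the argument list; each option is a separate element, so no shell quoting is needed. */
    static List<String> buildCommand(String sofficePath, String host, int port) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sofficePath);
        cmd.add("-headless");   // no GUI, so no X server is required
        cmd.add("-nologo");
        cmd.add("-norestore");
        cmd.add("-accept=socket,host=" + host + ",port=" + port + ";urp;");
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildCommand("/opt/openoffice.org3/program/soffice", "localhost", 8200);
        System.out.println(String.join(" ", cmd));

        // Only start the process when asked to, so the command can be inspected first.
        if (args.length > 0 && args[0].equals("--start")) {
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            System.out.println("soffice exited with code " + process.waitFor());
        }
    }
}
```

With the listener started this way, the existing conversion code could connect to the same port as before, and a Hindi .odt round trip would show whether headless mode alone reproduces the boxes.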