
[Solved] En-dash in pdf changes to em-dash when printed

Posted: Sun Dec 16, 2018 11:03 pm
by EinarFlydal
I have written a long text, split in 3 parts (separate files) with OOWriter. They were then converted to pdf, and sent to a printing house, and the book was printed.
File 1 comes out correctly, with en dashes (U+2013) with a space before and after, as in this text:

"Europas første UMTS-nettverk – det som nå er kjent som "3G", en forkortelse for "tredje generasjon" – ble tatt i bruk høsten 2002."

File 2 however, comes out with em dashes (U+2014) when printed, and spacing is left out after the dash, as here:

"Europas første UMTS-nettverk —det som nå er kjent som "3G", en forkortelse for "tredje generasjon" —ble tatt i bruk høsten 2002."

No difference, however, is seen in the files, neither in the odt-files, nor in the pdf-files.

The conversion happens when the pdf-files are printed in the printing house. Now, a second edition of the book is to appear, and this should be corrected. Is anyone able to explain how this can possibly happen, and how to address the problem?

Einar

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 12:21 am
by Lupp
(Deleted)
This was a rash and rather useless attempt to help.
The answer by John_Ha below tells much more precisely what I had in mind, and comes from a contributor much better informed.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 12:43 am
by EinarFlydal
Dear Lupp,

"Someone might analyse the pdf to find out if something went wrong during export from Writer to Pdf.
Without having this way as a possibility I see no chance to find the reasons."

Any suggestions as to who might be interested in doing this?

Einar

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 12:50 am
by Bill
Are you using a style guide that requires spaces before and after em dashes? Most of the advice I've found online says not to use spaces before or after en dashes, em dashes or hyphens.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 1:55 am
by John_Ha
Did you use AOO Export to PDF for both?

Or did you use a Virtual PDF Printer like PrimoPDF for one?

Open the PDFs in Adobe Reader and go File > Properties > Description ..., to see which application created the PDF and which PDF version was created.
AOO on a Mac works differently from AOO on Windows. On a Mac, going File > Export as a PDF ..., uses one method and one piece of software. Clicking the PDF icon uses a different method and different software. One method creates files with ligatures and one does not but I forget which is which.

Please upload two small example .odt files showing the problem so that it can be analysed. Use the Upload attachment tab below where you type (128 kB max); or use a file share site, Dropbox or Google Drive for a larger file. Also upload the PDF files created from the files. As you can only upload three items per post you will need to post twice.
EinarFlydal wrote:The conversion happens when the pdf-files are printed in the printing house.
On second thoughts it is probably caused by the printer. Talk to them to see what they do. Do they have a setting which deletes redundant spaces? Have they changed anything since the last one was printed? Do some simple one page printing tests.

It should not be a "missing font" problem because AOO embeds the used fonts in the PDF. That being said, remember that the font showing in the Writer font drop-down selection box is the font the document is asking for. This may NOT be the font being used to create the display because, if the font being asked for is not installed on the PC, Windows (or other operating system) will silently substitute a different font which is available. This may be wrong but imagine you are calling for font Fred, but you don't have Fred, so font Tom is used instead to display the document (and Tom is presumably embedded in the PDF??). If the printer has Fred it will be printed with Fred??

The TestFonts add-on is invaluable for finding missing fonts which the document is asking for, but which are not installed on the PC.

You can see which fonts are installed on the PC by Start > Control Panel > Fonts or by clicking C:\Windows\Fonts.

Try brute force and ignorance:

Insert a character after the dash and set it to white - it will not print but it will occupy the space. Or try inserting the preceding and following spaces as protected or non-breaking spaces. Does that help? Help says:
Inserting Protected Spaces, Hyphens and Conditional Separators

Non-breaking spaces

To prevent two words from being separated at the end of a line, hold down the Ctrl key and the Shift key when you type a space between the words.

In Calc, you cannot insert non-breaking spaces.

Non-breaking dash

An example of a non-breaking dash is a company name such as A-Z. Obviously you would not want A- to appear at the end of a line and Z at the beginning of the next line. To solve this problem, press Shift+Ctrl+ minus sign. In other words, hold down the Shift and Ctrl keys and press the minus key.

Hyphen, dash

In order to enter longer dashes, you can find under Tools - AutoCorrect Options - Options the Replace dashes option. This option replaces one or two minus signs under certain conditions with an en-dash or an em-dash (see OpenOffice Help).
For additional replacements see the replacements table under Tools - AutoCorrect Options - Replace. Here you can, among other things, replace a shortcut automatically by a dash, even in another font.

Definite separator

To support automatic hyphenation by entering a separator inside a word yourself, use the keys Ctrl+minus sign. The word is separated at this position when it is at the end of the line, even if automatic hyphenation for this paragraph is switched off.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 1:55 pm
by EinarFlydal
Bill and John Ha,
thank you for taking your time! See the attached screen dumps and answers below.

Bill,
quote: "Are you using a style guide that requires spaces before and after em dashes? Most of the advice I've found online says not to use spaces before or after en dashes, em dashes or hyphens."

In both of the pdfs, the en dashes show correctly. The conversion of the en dashes to the even longer em dashes and the removal of the space after the dash happen when printing just one of the two pdfs. Hence, I suppose it cannot be the style guide nor the Exchange table in OOWriter? Or do you disagree? If so, where in the style guide could such a requirement be stated? The replacement table is empty.

John Ha,
"Did you use AOO Export to PDF for both? Or did you use a Virtual PDF Printer like PrimoPDF for one?"
YES, I used AOO Export to PDF for both files.


"AOO on a Mac works differently from AOO on Windows."
I used Windows.
In both of the pdfs, the dashes show correctly. The conversion of the dashes to the even longer em dashes and the removal of space after the dash happens when printing just one of the two pdfs. Hence, I suppose it cannot be the style guide nor the Exchange table in OOWriter? Or do you disagree?

"Please upload two small example .odt files showing the problem so that it can be analysed. ... Also upload the PDF files created from the files."
I cannot forward entire files of the book. See attachments, which are
3. sample from .odt file before being converted to pdf, checked for hidden signs, with correct dashes,
2. a new pdf created, with correct dashes, and
1. an image of the corresponding page in the printed book - with em dashes and no space after. Sorry for upside down!


"On second thoughts it is probably caused by the printer. Talk to them to see what they do. Do they have a setting which deletes redundant spaces? Have they changed anything since the last one was printed? Do some simple one page printing tests."
The two files + two more are different parts of the same book. They are merged by the printer before printing. Only one of the files has the problem of creating these em dashes. Hence, it must be something in the PDF, I suppose?

"It should not be a "missing font" problem because AOO embeds the used fonts in the PDF. "
The font is installed on my PC, and the PDF properties confirm that the font is embedded in the PDF.


"Try brute force and ignorance: Insert a character after the dash and set it to white - it will not print but it will occupy the space. Or try inserting the preceding and following spaces as protected or non-breaking spaces. Does that help?"
This I cannot do, except as a last resort. It gets too complicated to run such experiments with the printer.

As to the Tools - AutoCorrect Options- Options the Replace dashes option:
I suppose that after such a replacement is done, there is no code left in the text. Hence, there should not be any code in the PDF that could alter the printing? Or?

Thank you for your patience!

Einar

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 2:28 pm
by John_Ha
We need the actual .odt and .pdf files please! Just the one page is required so delete all but the page from the .odt file. You can strip a single page from a PDF with PDFSam.
EinarFlydal wrote:In both of the pdfs, the en dashes show correctly. The conversion of the en dashes to the even longer em dashes and the removal of the space after the dash happen when printing just one of the two pdfs.
So we have PDF 1 and PDF 2. Both view correctly on your PC. But they print differently at the printer.

What happens when you print the pages on your printer? You can select a single page from a PDF to print it.
EinarFlydal wrote:Hence, I suppose it cannot be the style guide nor the Exchange table in OOWriter? Or do you disagree? If so, where in the style guide could such a requirement be stated? The replacement table is empty.

I don't know much about PDFs but I strongly suspect that the printer is doing this. If the PDF displays correctly then it should print correctly.

I have sent you my email ID so you can send me the .odt and the .PDF files.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 2:43 pm
by EinarFlydal
The dashes come out correctly when I print them on my own home printer.
If I have understood correctly, the PDF format should not be able to contain instructions or code that could make space+dash+space come out as space+dash.
Is that right?

If this is enough proof, I do not have to send the files. So I wait for your answer.
Einar

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 2:53 pm
by John_Ha
EinarFlydal wrote:If I have understood correctly, the PDF format should not be able to contain instructions or code that could make space+dash+space come out as space+dash.
Is that right?
As I said above I don't know much about PDF files but I am 99% certain that is correct. I wanted to see the .odt files so I could see how the dashes and spaces were encoded in the XML.

Try asking on an Adobe forum - they will almost certainly have come across it before.

For clarity, there are hyphens, en-dashes (the width of the letter n) and em-dashes (the width of the letter m). English style guides are definitely of the view that there should only be spaces around an em-dash in exceptional circumstances; and never around hyphens or en-dashes. See Hyphen, en-dash, em-dash

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 3:17 pm
by EinarFlydal
Thank you for clarification.
My intention is to have en-dashes, not em-dashes, throughout the text, i.e. U+2013.
In the language of this text, Norwegian, we also stick to the English rule, which is how the en-dashes are supposed to be used throughout the text:

4. SPACES AND DASHES
... The only exceptions are en-dashes used to mark a break in thought and change in sentence structure.

By private email, I have sent you a link to a Dropbox address where you will have access to the files.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 7:42 pm
by John_Ha
EinarFlydal sent me his .odt files and the PDFs created from them.

First, I am out of my depth here - I don't know much about this so take anything I write with a large pinch of salt. Secondly, I do not have the Sabon MT Pro font being used so I am using a substitute font and when I insert an en- or em-dash it gets inserted with Tahoma so it is impossible for me to compare. I do not know a method where I can select a character and say "what is this character's Unicode?"

I cannot see a problem on page 194 of the .odt file, or in a PDF created from page 194, which would cause it to print without the space. The XML is as expected with the space. There are many unaccepted changes (Edit > Changes > Accept or reject) in Chapter 15.
Clipboard01.gif: Photocopy of printed page 194 - note missing space after right dash
xml.gif: content.xml showing the space
pdf.gif: PDF created from .odt file showing the space

I think the only way to solve this is in discussion with the printer. You need to create some simple .odt test files with only two or three lines of text in them which demonstrate the problem. Copy the lines from your huge documents to make the test files.

A random thought. You are using justified left and right text and the problem is at the end of a line. I wonder if it still happens if you don't justify, or if the dash is not at the end of the line.

Another. The printer presumably has Adobe Acrobat, which can edit PDF files. Did (s)he edit the PDF file and delete the space from the PDF (s)he printed from?
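
For anyone who wants to repeat the content.xml check without reading the XML by eye, here is a minimal Python sketch. It relies only on the fact that an .odt file is a ZIP archive containing content.xml. The function names `read_content_xml` and `find_dashes` are made up for this illustration, and the scan is deliberately naive: it looks at the raw characters next to each dash, so a dash that sits directly against an XML tag (for example inside a tracked change) would need a closer look by hand.

```python
import zipfile

# The two dash characters discussed in this thread.
DASHES = {"\u2013": "EN DASH", "\u2014": "EM DASH"}

def read_content_xml(odt_path: str) -> str:
    """Return content.xml from an .odt file (an .odt is a ZIP archive)."""
    with zipfile.ZipFile(odt_path) as archive:
        return archive.read("content.xml").decode("utf-8")

def find_dashes(xml_text: str):
    """List (dash name, character before, character after) for each dash."""
    hits = []
    for i, ch in enumerate(xml_text):
        if ch in DASHES:
            before = xml_text[i - 1] if i > 0 else ""
            after = xml_text[i + 1] if i + 1 < len(xml_text) else ""
            hits.append((DASHES[ch], before, after))
    return hits

# Hypothetical sample text; for a real file use read_content_xml(path) instead.
sample = "<text:p>nettverk \u2013 det som \u2014ble tatt</text:p>"
for name, before, after in find_dashes(sample):
    print(name, "space before:", before == " ", "space after:", after == " ")
```

If every hit in the real file is an EN DASH with a space on both sides, the .odt itself is unlikely to be the source of the problem.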

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 9:50 pm
by John_Ha
Another workaround if you cannot resolve it with the printer.

Insert a small rectangle after the dash, with Line set to None, and Area set to No fill, which makes it invisible. Anchor it AS a character, which means it will always precede the following letter. See image below - top shows the rectangle when selected and bottom shows how it appears normally. It will not affect the spellcheck, which adding a white character of the correct width would do.

That being said I think the problem is something which happened in the past and will be impossible to track down. You should therefore create a new PDF from the .odt file and ask the printer to print a proof copy. I am sure it will be OK.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 9:56 pm
by Bill
John_Ha wrote:I do not know a method where I can select a character and say "what is this character's Unicode?"
In this case, you can use your Web browser. Just copy the character and paste it in a Google search box. There are also Unicode lookup sites where you can paste characters and get the Unicode number.

https://unicodelookup.com/

Another thought: some Norwegian style guides that can be found online say that em dashes are never used in Norwegian, so is it possible that the printer is printing two en dashes instead of an em dash?

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 10:20 pm
by Lupp
Bill wrote:
John_Ha wrote:I do not know a method where I can select a character and say "what is this character's Unicode?"
In this case, you can use your Web browser. Just copy the character and paste it in a Google search box. ...
If I already had LibO or AOO open I would open an empty spreadsheet, copy the character into a cell, say A2, and enter the formula

="U+" & DEC2HEX(UNICODE(A2);4)

into the next cell. No Google, nothing else needed.

In recent LibO there is an additional feature: Alt+X toggles between the character to the left of the insertion cursor and its Unicode code point, in every module of the application. The "U+" is added even if it was not entered originally, and the hex digits A through F are given in lower case.
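
The same lookup can be done outside the office suite. This is only a sketch of the idea behind the spreadsheet formula, written in Python; `code_point_label` is a name invented for this example, while `ord`, `format` and `unicodedata.name` are standard Python.

```python
import unicodedata

def code_point_label(ch: str) -> str:
    """Return the 'U+XXXX' label for a single character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Four hex digits, upper case, like DEC2HEX(...;4) in the formula.
    return "U+" + format(ord(ch), "04X")

# Hyphen-minus, en dash and em dash, the three candidates in this thread.
for ch in ("-", "\u2013", "\u2014"):
    print(code_point_label(ch), unicodedata.name(ch))
```

Pasting the character from the document in place of one of the three test characters shows at once which dash it is.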

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 10:56 pm
by EinarFlydal
Thank you all for your help! It will take some time to work it through!

Einar

Re: m dash in pdf changes to em dash minus space when printe

Posted: Mon Dec 17, 2018 11:54 pm
by John_Ha
Bill, Lupp

Thanks. I couldn't get the Unicode lookup site to work after the first time as, for some reason, each later attempt got results like %E2%80%93.

However, Lupp's spreadsheet worked a treat and showed they are both en-dashes U+2013. I have uploaded the paragraph with the two en-dashes as two en-dashes.odt.

Re: m dash in pdf changes to em dash minus space when printe

Posted: Tue Dec 18, 2018 2:00 am
by MrProgrammer
John_Ha wrote:Thanks. I couldn't get the Unicode lookup site to work after the first time as, for some reason, each later attempt got results like %E2%80%93.
I had no difficulty with the site, as long as I pasted the character into the web page's dialog box. However, if I put the character in the URL, I also received %E2%80%93. E2 80 93 is the UTF-8 encoding for the EN DASH. The sixteen bits of hexadecimal 2013 are divided into groups of four, six, and six. The groups are used to form three bytes in the UTF-8 encoding. The prefixes for the three bytes (1110, 10, and 10) are specified in the encoding rules.

U+2013 → 0010 0000 0001 0011 → UTF-8 1110 0010 1000 0000 1001 0011 → E2 80 93
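
The bit shuffling above can be checked with a few lines of Python. This is a sketch for illustration only: `utf8_three_bytes` is a made-up helper that handles just the three-byte range (U+0800 to U+FFFF), and its result is compared against Python's own encoder. `urllib.parse.quote` then shows where the %E2%80%93 in the URL comes from.

```python
from urllib.parse import quote

def utf8_three_bytes(code_point: int) -> bytes:
    """Encode a code point in the range U+0800..U+FFFF as three UTF-8 bytes."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles the three-byte range")
    b1 = 0b11100000 | (code_point >> 12)          # prefix 1110 + top four bits
    b2 = 0b10000000 | ((code_point >> 6) & 0x3F)  # prefix 10 + middle six bits
    b3 = 0b10000000 | (code_point & 0x3F)         # prefix 10 + low six bits
    return bytes([b1, b2, b3])

en_dash = "\u2013"
manual = utf8_three_bytes(ord(en_dash))
assert manual == en_dash.encode("utf-8")      # agrees with the built-in encoder
print(" ".join(f"{b:02X}" for b in manual))   # E2 80 93
print(quote(en_dash))                         # %E2%80%93
```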

Re: m dash in pdf changes to em dash minus space when printe

Posted: Tue Dec 18, 2018 11:15 am
by EinarFlydal
Thanks again to all of you!
I will go through your suggestions and work through them together with the printer. So, at least for now, I mark the subject as "Solved".
Einar