Automatically wrapping URLs after slashes, no hyphenation

Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm


Post by Anark »

At the end of a line, is it possible to automatically wrap URLs after slashes, after slashes only, and avoid hyphenation?
OOo 2.4.X on Ubuntu 8.x
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by acknak »

Does OOo actually hyphenate the names in a URL?

You can prevent that by selecting the URL, Format > Character > Font > Language: none.

Using OOo 3 at least, it seems that URLs are normally broken at the slashes.

Can you move the URL into a footnote or something? Putting that ugly computer-code address in your running text is not so nice for the reader.
AOO4/LO5 • Linux • Fedora 23
udippel
Posts: 39
Joined: Tue Feb 19, 2008 9:09 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by udippel »

acknak wrote:
You can prevent that by selecting the URL, Format > Character > Font > Language: none.

Using OOo 3 at least, it seems that URLs are normally broken at the slashes.
Where? The reference list in our book looks very ugly. The publisher wants justified text, and this results in many lines that hold only one or two words stretched across huge gaps, followed by the complete URL on the next line. I can't confirm, on OpenOffice 3, that the slashes are used as break points. Instead of a test case, I simply suggest taking any URL and pasting it three times in a row, separated by single blanks.
All URLs here are left-aligned, whatever I do with Default Format and/or Language: none.
The only workaround so far is to insert Ctrl+'-' by hand, which leads to wrong URLs with additional dashes.
Breaking at slashes would be the desired behaviour, though.

Uwe
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

Uwe -- for URLs in running text, I suppose you can simply hit shift+enter after the last slash that fits on a line, thus using a soft line break.

My own problem doesn't involve running text: I have a few academic papers with truckloads of long URLs in the reference lists. As it is, URLs start on a new line if they don't fit on the old one. I'd like those URLs to start on the old line and wrap automatically after the last slash that fits on the old line.

Also, for the purpose of this discussion, can we agree that the term hyphenation assumes that a hyphen gets inserted to mark a word division? In this sense, I want to prevent hyphenation, as I don't want characters in a URL that don't belong there.

I'm planning to upgrade to OOo 3.0 sometime soon, but I will need to do this slowly and do some testing first. I'm using Zotero [http://zotero.org] as a citation manager, and its word-processing plugin is not very robust at all: it's working for me now, so I'm reluctant to mess with it.

In other words: I'm looking for a quicker fix than an upgrade.
OOo 2.4.X on Ubuntu 8.x
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by acknak »

It might be interesting for you to attach a page from your reference list. It doesn't have to have all the document formatting, just the URLs, then we can compare how the same strings break in 2.x compared to 3.x. I'm thinking that would be the best way to tell whether the problem would be fixed by an upgrade.

BTW, you can run 2.x and 3.x at the same time--they don't interfere--so, you could try OOo 3 without any risk. Just don't save your file from OOo 3; that will change the file format.
AOO4/LO5 • Linux • Fedora 23
udippel
Posts: 39
Joined: Tue Feb 19, 2008 9:09 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by udippel »

Anark wrote:Uwe -- for URLs in running text, I suppose you can simply hit shift+enter after the last slash that fits on a line, thus using a soft line break.

My own problem doesn't involve running text: I have a few academic papers with truckloads of long URLs in the reference lists. As it is, URLs start on a new line if they don't fit on the old one. I'd like those URLs to start on the old line and wrap automatically after the last slash that fits on the old line.

Also, for the purpose of this discussion, can we agree that the term hyphenation assumes that a hyphen gets inserted to mark a word division? In this sense, I want to prevent hyphenation as I don't want characters in an URL that don't belong there.
Anark, we seem to suffer from the exact same problem. We are going to have our most recent book published, and the references containing URLs are mostly just butt-ugly.
I agree fully with you:
1. Of course, we want to prevent hyphenation in the sense of characters that don't belong there. What I'd be looking for is hyphenation without a visible hyphen.
2. As our publisher insists on justified layout, a URL that starts on a new line usually leaves only a few words stretched across the line above, which looks like ****.
3. I'd want automatic 'hyphenation' - or better: line-wrap - that treats the slashes as the preferred places to break the line.

Do I understand correctly what you were asking?

Uwe
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

udippel wrote:Do I understand correctly what you were asking?
Yes, I think we're struggling with the same issue.
acknak wrote:It might be interesting for you to attach a page from your reference list. It doesn't have to have all the document formatting, just the URLs, then we can compare how the same strings break in 2.x compared to 3.x.
Thanks for the offer! Just copy the URL at the top of this page, paste it into a new .odt document, then push it to the right using a few tab stops. If you get the URL to break automatically after a slash, only after a slash, and with no hyphen added, then we'd both be happy to learn how you do it.
OOo 2.4.X on Ubuntu 8.x
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by acknak »

Sorry, no luck. It breaks between any characters in the URL.
AOO4/LO5 • Linux • Fedora 23
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

Thanks for trying.

I suspect that this would need to be put to the developers as a feature request.
OOo 2.4.X on Ubuntu 8.x
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by acknak »

Yes, I thought there must be one already filed, but I've not been able to find a request concerning how URLs should break across lines.
AOO4/LO5 • Linux • Fedora 23
udippel
Posts: 39
Joined: Tue Feb 19, 2008 9:09 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by udippel »

acknak wrote:Sorry, no luck. It breaks between any characters in the URL.
If only it did. That is not quite correct: on OpenOffice 3 it is not 'any'. The URL as such does not break across lines; as I mentioned, it ends up whole on one line or the next. Where it does break, it breaks only at punctuation.
I have attached a document with some of the references that we used, with comments included. (I also added it as a PDF, just in case it looks different on a different version of OpenOffice, with a different dictionary or so.)

Spontaneously, I feel we need a sibling to hyphenation, for URLs, with rules like:
1. Never hyphenate.
2. Wrap the line before any forward slash instead.
3. Don't wrap elsewhere.
4. If you hit a dash, put that dash on the next line [to avoid its interpretation as a hyphenation].

What do the others think?
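
For anyone who wants to experiment, here is a minimal Python sketch of rules 1 to 4 above. It is not an OOo macro and uses no OOo API: it works on a plain string and counts characters instead of measuring rendered width, which is a simplifying assumption. The names break_points and wrap_url are made up for illustration, and skipping the "://" after the scheme is an extra assumption that is not among the four rules.

```python
import re

# Matches a leading scheme such as "http://" so that it is never split.
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")


def break_points(url, chars="/-"):
    """Return the indices at which a new line may start.

    A new line may start just before any character in `chars`
    (rule 2 for slashes, rule 4 for dashes), but not inside the scheme.
    """
    scheme = SCHEME.match(url)
    start = scheme.end() if scheme else 0
    return [i for i in range(max(start, 1), len(url)) if url[i] in chars]


def wrap_url(url, width, chars="/-"):
    """Split `url` into lines of at most `width` characters where possible.

    No character is ever added (rule 1) and no break is made outside the
    permitted points (rule 3), so a stretch without any slash or dash
    stays in one piece even if it is longer than `width`.
    """
    points = break_points(url, chars)
    lines = []
    line_start = 0
    while len(url) - line_start > width:
        # Permitted breaks that keep the current line within `width`.
        fitting = [p for p in points if line_start < p <= line_start + width]
        if not fitting:
            break  # rule 3: give up rather than break somewhere else
        lines.append(url[line_start:fitting[-1]])
        line_start = fitting[-1]
    lines.append(url[line_start:])
    return lines
```

Calling wrap_url("http://www.openoffice.org/issues/show_bug.cgi?id=98482", 30) splits the address in front of "/issues", and joining the returned lines always gives back the original URL unchanged.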

Uwe
Attachments
Demo_Hyphenate.pdf
(81.45 KiB)
Demo_Hyphenate.odt
(12.32 KiB)
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

I wonder if this could be handled in a macro.
udippel wrote:2. Wrap the line before any forward slash instead
My own intuition would be to break after a slash. A casual Web search reveals that style guides disagree on this point; you will find, "Only break a URL that goes to another line after a slash or before a period. Do not insert a hyphen", and then you will find, "If you must put a site on two lines, break before a slash or dot". One would need to check if the biggies (MLA, APA, Chicago, Harvard, etc.) really differ on this or if there's a consensus one way or the other.
udippel wrote:3. Don't wrap elsewhere.
4. If you hit a dash, put that dash on the next line [to avoid its interpretation as a hyphenation]
Rule 4 would be redundant as a special case of rule 3, which is a re-statement of rule 2.

If you wanted to cover all bases, you'd need a rule about what to do with URLs that go on for miles without a slash, such as the notoriously cruddy ones generated by Amazon.com. Here's a random example:

http://www.amazon.com/gp/feature.html/r ... d_i=507846

[It turns out the board software shortens the URL. To view the address in its full glory, either follow the link or hover over it and check your browser's status bar.]
OOo 2.4.X on Ubuntu 8.x
udippel
Posts: 39
Joined: Tue Feb 19, 2008 9:09 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by udippel »

Anark wrote:I wonder if this could be handled in a macro.
udippel wrote:2. Wrap the line before any forward slash instead
My own intuition would be to break after a slash. A casual Web search reveals that style guides disagree on this point; you will find, "Only break a URL that goes to another line after a slash or before a period. Do not insert a hyphen", and then you will find, "If you must put a site on two lines, break before a slash or dot". One would need to check if the biggies (MLA, APA, Chicago, Harvard, etc.) really differ on this or if there's a consensus one way or the other.
udippel wrote:3. Don't wrap elsewhere.
4. If you hit a dash, put that dash on the next line [to avoid its interpretation as a hyphenation]
Rule 4 would be redundant as a special case of rule 3, which is a re-statement of rule 2.

If you wanted to cover all bases, you'd need a rule about what to do with URLs that go on for miles without a slash, such as the notoriously cruddy ones generated by Amazon.com.
I hope someone senior enough reads this. I tried Word 2007, and it behaves much like OO does now. If these rules were implemented, OO could surpass Word in this respect, which in itself is a good selling point.

I do not fully agree with your comments here:
2. I don't think the break should happen after a slash. Have a look at my samples: if a new line starts with a slash, it reads much more easily and quickly as the 'continuation of a URL'.
3. My examples clearly show wraps at '1' as well as after '-'. So rule 3 is not really a restatement of rule 2, which deals with slashes only.
4. This is the special case of the dash. I suggest that we may break there, but with the dash on the new line. The argument is similar to the one for beginning a new line with a slash: when a new line starts with a dash followed immediately by a printable character, it can easily be identified as the 'continuation of a URL'.

I did not go as far as your good example of a URL that stretches beyond a full line. Personally, and as far as our upcoming book is concerned, an implementation of the four rules would already have saved us quite a lot of hassle.
I guess the simplest approach would be for the coders to reuse the URL detection but, instead of applying blue underlining, apply a specific set of wrapping rules. Why not let users select a citation style, or offer a dialog for URL wrapping in which you could click 'after slash' and we could click 'before slash'?
I think we would be only too happy to handle links longer than, say, 1.5 line lengths by simply wrapping wherever the line ends.

Uwe
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

Uwe -- if you're in charge of shipping repro-ready copy to a small-time publisher that won't enforce any style guidelines, then I suppose you're free to decide where the slashes go. Big-time publishing operations do enforce styles, however, either in-house styles or standard style guides. Two such widely observed academic standards are MLA in the humanities and APA in the social sciences. Here's an official MLA page that states URLs must be broken after slashes:

http://www.mla.org/style_faq4

Here's a page that looks like a faithful transcript of the APA rules, and it says that URLs must be broken after slashes:

http://www.bedfordstmartins.com/online/cite6.html

Now, you seem to be a native speaker of German, and for all I know Duden, or whoever makes the rules in German style guides, may in fact say that URLs need to be wrapped before a slash. If so, and if you publish in German, then your line breaks will of course go before the slashes.

I agree that having a solution that won't handle Amazon-style URL cruft would be preferable to having no solution at all. The APA rule (see link above) of wrapping before a period might be preferable to your idea of wrapping before a hyphen, or, if implemented in addition to your rule, might reduce instances where any hyphen appears anywhere near a line break.

Again, I believe a Basic or Python coder could do this in a macro, which would be quicker than having it implemented as a regular feature in a future OOo release. It might still be quicker simply to remove justification from the References style in your document template, left-align your references, and break the URLs manually using shift+enter.

Or that's what I'm going to do, anyway, unless a better idea should come along.
OOo 2.4.X on Ubuntu 8.x
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

Okay -- here's an official APA page, which, weirdly, doesn't say anything about breaking before or after a slash:
The domain name extension (in the preceding example, ".org") can help you determine the appropriateness of the source for your purpose. Different extensions are used depending on what entity hosts the site. For example, the extensions ".edu" and ".org" are for educational institutions and nonprofit organizations; ".gov" and ".mil" are used for government and military sites, respectively; and ".com" and ".biz" are used for commercial sites. Domain name extensions may also include a country code (e.g., ".ca" for Canada or ".nz" for New Zealand).

The rest of the address indicates the directory path leading to the desired document. This part of the URL is case sensitive; transcribe the URL correctly by copying it directly from the address window in your browser and pasting it into your working document (make sure the automatic hyphenation feature of your word processor is turned off). Do not insert a hyphen if you need to break a URL across lines; instead, break the URL before most punctuation (an exception would be http://). Do not add a period after the URL, to prevent the impression that the period is part of the URL.
Source: http://www.apastyle.org/elecmedia.html
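
As a rough illustration of the quoted rule (no hyphen; break before most punctuation, but not inside "http://"), a Python sketch might look like the following. The set of punctuation characters, the use of U+200B ZERO WIDTH SPACE as the invisible break opportunity, and the name apa_break_opportunities are all assumptions made for illustration; the APA text does not list specific characters.

```python
import re

ZWSP = "\u200b"  # zero width space, assumed to act as an invisible break opportunity

# Matches a leading scheme such as "http://", which is left untouched.
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")


def apa_break_opportunities(url, punctuation="/.?&=-_"):
    """Insert an invisible break opportunity before each punctuation mark.

    Nothing visible is added, so no hyphen can appear in the URL, and the
    scheme prefix is exempt, as in the exception quoted above.
    """
    scheme = SCHEME.match(url)
    head = scheme.group(0) if scheme else ""
    tail = url[len(head):]
    pattern = "([" + re.escape(punctuation) + "])"
    return head + re.sub(pattern, ZWSP + r"\1", tail)
```

Removing every U+200B from the result gives back the original address, so the marked text can be converted back at any time.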
OOo 2.4.X on Ubuntu 8.x
udippel
Posts: 39
Joined: Tue Feb 19, 2008 9:09 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by udippel »

Anark wrote:Uwe -- if you're in charge of shipping repro-ready copy to a small-time publisher that won't enforce any style guidelines, then I suppose you're free to decide where the slashes go. Big-time publishing operations do enforce styles, however, either in-house styles or standard style guides. Two such widely observed academic standards are MLA in the humanities and APA in the social sciences. Here's an official MLA page that states URLs must be broken after slashes:

http://www.mla.org/style_faq4

Here's a page that looks like a faithful transcript of the APA rules, and it says that URLs must be broken after slashes:

http://www.bedfordstmartins.com/online/cite6.html

Now, you seem to be a native speaker of German, and for anything I know Duden, or whoever makes the rules in German style guides, may in fact say that URLs need to be wrapped before a slash. If so, and if you publish in German, then your line breaks will of course go before the slashes.

I agree that having a solution that won't handle Amazon-style URL cruft would be preferable to having no solution at all. The APA rule (see link above) of wrapping before a period might be preferable to your idea of wrapping before a hyphen, or, if implemented in addition to your rule, might reduce instances where any hyphen appears anywhere near a line break.

Again, I believe a Basic or Python coder could do this in a macro, which would be quicker than having it implemented as a regular feature in a future OOo release. It might still be quicker simply to remove justification from the References style in your document template, left-align your references, and break the URLs manually using shift+enter.

Or that's what I'm going to do, anyway, unless a better idea should come along.
Honestly, I don't know which style Duden prescribes.
Hehe, left-aligning is what we had been doing, with exactly that argument. Only, our publisher has thrown a bunch of demands at us, trying to hurt us. ;) They couldn't care less (about extra hyphens and whatnot), as long as the overall visual beauty is undisturbed. I called it 'slaughtering' the URL, and I would take exception to others slaughtering URLs that point to our work in return.
The same goes for hyphenation, which I had deactivated, only to be admonished for 'lack of knowledge' of the 'beauties' of 'modern word processing'. And we have more than 500 references; I don't feel like adjusting them manually one by one.

Thanks anyway, and keep me updated when a macro, or better a regular implementation, becomes available. Too late for this publication; but useful for future work.

Uwe
Anark
Posts: 10
Joined: Sat Jan 24, 2009 7:28 pm

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by Anark »

P.S. You can vote for this issue if you think the OOo developers should prioritize it:

http://www.openoffice.org/issues/show_bug.cgi?id=98482

You will need to register, though.
OOo 2.4.X on Ubuntu 8.x
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by acknak »

Good! Thanks for posting the link.
AOO4/LO5 • Linux • Fedora 23
paenson
Posts: 2
Joined: Wed Feb 24, 2010 12:20 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by paenson »

There is a fairly simple solution:

1) In the menu Tools > Options > Language Settings > Languages, activate "CTL" (keeping "none" as its default language works).
2) Place the cursor before or after a "/" in one of your URLs (depending on your taste) and insert a "Formatting Mark"; choose "No-width optional break".
3) Select both characters (the slash and the invisible break) and copy them to the clipboard.
4) In "Find & Replace", replace every / with the two copied characters.
5) For the "http://" part, do the opposite: replace the resulting sequence of four characters (it looks a bit like "/,/,") with a plain //.
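
The find-and-replace steps above can be sketched on a plain string in Python. This assumes that the "No-width optional break" formatting mark corresponds to U+200B ZERO WIDTH SPACE; the function names are hypothetical, and nothing here calls the OpenOffice API.

```python
ZWSP = "\u200b"  # assumed code point of the "No-width optional break"


def add_optional_breaks(url, after=True):
    """Attach an invisible break opportunity to every slash in `url`."""
    # Steps 2-3: the two-character unit, with the break after or before the slash.
    unit = "/" + ZWSP if after else ZWSP + "/"
    # Step 4: replace every slash with the two-character unit.
    marked = url.replace("/", unit)
    # Step 5: turn each four-character run back into a plain "//".
    return marked.replace(unit + unit, "//")


def strip_optional_breaks(text):
    """Remove the invisible marks again, restoring the original URL."""
    return text.replace(ZWSP, "")
```

For example, add_optional_breaks("http://www.mla.org/style_faq4") leaves "http://" intact and places one invisible break after the slash that follows "www.mla.org"; passing after=False puts it before that slash instead.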
OpenOffice 3.2 on Windows XP
TAB
Posts: 283
Joined: Sun Feb 24, 2008 5:04 am

Re: Automatically wrapping URLs after slashes, no hyphenation

Post by TAB »

acknak wrote:Using OOo 3 at least, it seems that URLs are normally broken at the slashes.
In OO4.0.1, URLs are not broken at slashes, but they are at ... hyphens! As in URLHyphen.odt:

http://www.google.ht/search?q=champ+mag ... t=firefox-
a&hs=yxx&rls=org.mozilla:en-
US:official&prmd=imvns&tbm=isch&tbo=u&source=univ&sa=X&ei=gZkZT6bUFsbZ0QH4gvCdCQ
&ved=0CEUQsAQ&biw=1172&bih=653

A URL, being a single string (URLHyphen.txt), would look better if all lines were filled, i.e., not broken at hyphens.
Attachments
URLHyphen.txt
long URL, whole
(221 Bytes)
URLHyphen.odt
long URL, formatted
(10.44 KiB)