[Solved] Wildcards question: Trimming URLs

Discuss the spreadsheet application
Post Reply
GavinAD
Posts: 3
Joined: Sat Apr 21, 2018 2:32 pm

[Solved] Wildcards question: Trimming URLs

Post by GavinAD »

Hi all,

Long-time lurker here.

I have a document containing 20,000 URLs. They go to specific webpages. I need to cut the URL so that it goes to the page before.

This post before was the closest I could find to my answer for a problem I have: viewtopic.php?f=9&t=58098. However, I need the inverse of the solution provided.

So, for example, in this above URL (viewtopic.php?f=9&t=58098), I need a wild card that cuts out "viewtopic.php?f=9&t=58098" but leaves the remainder (https://forum.openoffice.org/en/forum/)

In the linked post @Acknak offers the solution for the opposite (^.*[/\\]). That works perfectly but it does the opposite of what I need. I've spent a couple of hours fiddling and reading up and still can't work out how to do the inverse.

Would really appreciate any help from the wildcard whizzes out there — and, should you see this @Acknak, thanks a lot for helping the other person; yours gave me hope that it was possible...!

Thanks a lot,

Gavin
Last edited by Hagar Delest on Sun Apr 22, 2018 8:59 pm, edited 1 time in total.
Reason: tagged [Solved].
Apache OpenOffice 4.1.5 ??? either Mac or Windows OSs
FJCC
Moderator
Posts: 9274
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Wildcards question: Trimming URLs

Post by FJCC »

Try this as your search term.

Code: Select all

[^/\\]+$
It is looking for characters that are not / or \ up to the end of the paragraph.
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
GavinAD
Posts: 3
Joined: Sat Apr 21, 2018 2:32 pm

Re: Wildcards question: Trimming URLs

Post by GavinAD »

Lifesaver. Thanks so, so much. I won't pretend to understand what you've written or why it works but appreciate that it does!

Two other things, if anyone knows this: I want to now search for the part at the end of the paragraph between the two slashes and including the final slash.

So for the example above: https://forum.openoffice.org/en/forum/ the search should find only "forum/" - this'll enable me to then go back through the data and work out whether these need to be slimmed back and simply replace them if they're redundant.

Finally, how do find and replace cells that are duplicates within a column? For instance, here, I'd end up with lots of instances of "https://forum.openoffice.org/en/forum/" where I just want one remaining. Thanks!
Apache OpenOffice 4.1.5 ??? either Mac or Windows OSs
GavinAD
Posts: 3
Joined: Sat Apr 21, 2018 2:32 pm

Re: Wildcards question: Trimming URLs

Post by GavinAD »

Managed to solve the first question below, I think:

([^/]+$)

How am I able to remove duplicate entries in the same column please, anyone?
Apache OpenOffice 4.1.5 ??? either Mac or Windows OSs
FJCC
Moderator
Posts: 9274
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Wildcards question: Trimming URLs

Post by FJCC »

Say your URLs are in A1:A1000. In B1 put the formula

Code: Select all

=COUNTIF(A$1:A1;A1)
and copy it down the column. That will return 1 the first time an entry appears and it will return 2 or more on subsequent entries. You can then filter column B for values of 1 and use Copy and Paste to place those entries elsewhere or you can filter for values greater than 1 and delete those rows. I avoid deleting data, so I would use the first option.
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Wildcards question: Trimming URLs

Post by Villeroy »

GavinAD wrote:How am I able to remove duplicate entries in the same column please, anyone?
Select the column
menu:Data>Filter>Standard Filter...
Field name: column x
Condition: =
Value: -- not empty -- (selectable from the list)
[More Options]
[X] No duplicates
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Post Reply