Search and replace this

Posted: Wed Jun 13, 2018 6:21 pm
by Ciscokid2
I am trying to quickly edit several hundred email addresses in a text file. Here is a fictitious example.

I want to find and replace the first set of quotation marks and everything following up to and including the > (that is, all of this: " John.Smith"@here.ac.ca>). The text between the " and the > is different in each address, of course, so it would have to be matched by a wildcard. Thanks for any ideas. I can't get my old retired brain around this.

John.Smith@here.ac.ca" John.Smith"@here.ac.ca>

Re: Search and replace this

Posted: Wed Jun 13, 2018 6:55 pm
by FJCC
If there is one email address per line then
Code:
".+>

with Regular Expressions selected in the More Options area, should do it.
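
As a rough illustration outside Calc, the same pattern can be tried against the fictitious line from the first post with Python's `re` module. This is only a sketch: Python's regex engine is not the one Find & Replace uses, and the variable names are made up for the example.

```python
import re

# The fictitious line from the first post.
line = 'John.Smith@here.ac.ca" John.Smith"@here.ac.ca>'

# ".+>  means: a straight double quote, one or more characters, then a >.
# Because + is greedy, the match runs to the last > on the line.
cleaned = re.sub(r'".+>', '', line)

print(cleaned)
```

On this one line the pattern removes everything from the first quotation mark onward. Whether that holds for the real file depends on every line having the same shape.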

Re: Search and replace this

Posted: Wed Jun 13, 2018 9:24 pm
by Ciscokid2
No that did not work. Perhaps I left out too much of the lines. Let me put the whole thing in even though the first mail to part seems ok to leave.

"mailto:John.Smith@here.gc.ca" <"mailto:John.Smith"@here.gc.ca>

Now, when I try “.+> to take out the second part " <"mailto:John.Smith"@here.gc.ca> nothing is found. I have selected regular expressions and tried selecting the whole file, part of it, etc. Otherwise find and replace works properly for single words or " >. Those are all found just fine.

I really appreciate someone who can figure this out.

So to be clear, I want all of this to go " <"mailto:John.Smith"@here.gc.ca> including the space after the first set of quotations
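
The behaviour described above can be checked outside Calc with a small Python sketch. It assumes that the typographic quote shown in the post is what was actually typed into the search box and that the lines in the file use straight quotes; neither is confirmed. Python's `re` module stands in for Calc's regex engine here, and the narrower pattern at the end is only a suggestion, not a tested fix for the real file.

```python
import re

# The second example line, with straight quotes.
line = '"mailto:John.Smith@here.gc.ca" <"mailto:John.Smith"@here.gc.ca>'

# A typographic opening quote (U+201C) is a different character from the
# straight quote (U+0022) in the data, so this search finds nothing.
curly_hit = re.search('\u201c.+>', line)

# With a straight quote the pattern matches from the very first quote,
# which on this line removes everything.
greedy_result = re.sub(r'".+>', '', line)

# Hypothetical narrower pattern: start at the quote that is followed by " <".
narrow_result = re.sub(r'" <.+>', '', line)

print(curly_hit, repr(greedy_result), narrow_result)
```

The narrower pattern removes exactly the span quoted in the post, which leaves the opening quotation mark in front of mailto: in place.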

Re: Search and replace this

Posted: Wed Jun 13, 2018 9:46 pm
by Zizi64
Ciscokid2 wrote:I really appreciate someone who can figure this out.

Please upload an ODF type sample file here without sensitive data, but with same structure as your original file.

Re: Search and replace this

Posted: Thu Jun 14, 2018 5:12 am
by MrProgrammer
Ciscokid2 (first post) wrote:John.Smith@here.ac.ca" John.Smith"@here.ac.ca>
Ciscokid2 (second post) wrote:"mailto:John.Smith@here.gc.ca" <"mailto:John.Smith"@here.gc.ca>
Ciscokid2 wrote:I want to find and replace the first set of quotation marks and everything following up to and including the >

You've given us two different examples, one with two quotation marks, one with four quotation marks, one with mailto: one without that.

Which is it? Or do you have both formats?

In the second case, removing (your words) "the first set of quotation marks and everything following up to and including the >" leaves nothing! Is that really what you want?

Attach a document demonstrating the situation (remove confidential information then use Post Reply, not Quick Reply, and don't attach a picture instead of the document itself). Provide enough examples showing what is desired so that there can be no doubt about how to handle each case. I suspect this is simple to do with [Tutorial] Text to Columns but I can't offer more advice without knowing your real data, not some fictitious example, and precisely what result is desired.

Re: Search and replace this

Posted: Thu Jun 14, 2018 1:52 pm
by Ciscokid2
Thank you. I was being clear as possible when I said that out of the entire line I wanted this to go " <"mailto:John.Smith"@here.gc.ca> that is the entire part which runs to the end of the line. Thanks if you can figure it out.

Re: Search and replace this

Posted: Thu Jun 14, 2018 2:42 pm
by keme
So, based on the examples I can make an educated guess:
  • Outer <> delimiters should persist when present.
  • The mail address within (or without) the <> needs adjustment as follows:
    • everything before the @ should be quoted as one item,
      • leading spaces if present
      • "mailto:" link protocol specifier if present.
      • recipient ID
    • Mail domain specifier after the @ should be kept as is, only stripping quotes where present.
Is that what you need?

Also, do you need to check for matched pairs (quotes and angle brackets) and/or null values in pre-existing entries so we only modify entries where the modification is relevant, or can we assume that source data is continuous and all well structured according to the above description?

Re: Search and replace this

Posted: Thu Jun 14, 2018 4:58 pm
by Lupp
I'm somehow baffled now.
I know of one format used with the protocol specifier "mailto:" that additionally allows a plain name to be included in a specific position, excluded from interpretation by a pair of straight double quotes. Examples of valid email addresses using "mailto:" are:
mailto:me@somedomain.tld
mailto:<me@somedomain.tld>
mailto:""<me@somedomain.tld>
mailto:"Lupp"<me@somedomain.tld>
I did not research the specifications, but drew my conclusions from examples that my Thunderbird accepts. If I enter an ordinary email address in Open-/Libre-Office, it is recognised (with URL recognition enabled) and the link in the background is automatically prefixed with "mailto:".

Thus an applicable syntax in RegEx should be (a few minor aspects aside):
Code:
(^|\W)((mailto:)?[A-Z][A-Z0-9]+@[A-Z][A-Z0-9]{2,}\.[A-Z]{2,}(\W|$)|(mailto:"[^"]*"<)[A-Z][A-Z0-9]+@[A-Z][A-Z0-9]{2,}\.[A-Z]{2,}>)(\W|$)


Please note that the RegEx will not reject entries with a malformed mailto:"??" part (e.g. a missing double quote) if a correct match then follows.
I'm interested in learning from the OQ how I misinterpreted the question, and from everybody how to simplify, improve, or aptly criticize the RegEx.
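
One way to probe the expression is to run it over a few sample strings. The sketch below uses Python's `re` module with case-insensitive matching, on the assumption that the search is run with Match case switched off; Python's engine may differ from the one in Open-/Libre-Office in details, and the sample list and names are chosen only for this illustration.

```python
import re

# The expression from the post, split over two lines for readability only.
PATTERN = re.compile(
    r'(^|\W)((mailto:)?[A-Z][A-Z0-9]+@[A-Z][A-Z0-9]{2,}\.[A-Z]{2,}(\W|$)'
    r'|(mailto:"[^"]*"<)[A-Z][A-Z0-9]+@[A-Z][A-Z0-9]{2,}\.[A-Z]{2,}>)(\W|$)',
    re.IGNORECASE,
)

samples = [
    'mailto:me@somedomain.tld',
    'mailto:"Lupp"<me@somedomain.tld>',
    'John.Smith@here.ac.ca',  # address shape used in the question
]

# True where the expression finds a match somewhere in the sample.
matches = {s: PATTERN.search(s) is not None for s in samples}

for sample, found in matches.items():
    print(found, sample)
```

In this sketch the first two samples match, while the address from the question does not: the part before the @ only allows letters and digits, so the dot in John.Smith stops the match, and the two-part domain ending .ac.ca does not fit either.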

Re: Search and replace this

Posted: Thu Jun 14, 2018 5:35 pm
by MrProgrammer
Ciscokid2 wrote:No that did not work.
"It didn't work" isn't helpful in the forum because it tells us what did not happen. Please never use that phrase in a post. We need to know what the data looked like beforehand, exactly what actions you took, what the data looked like afterward, and what you expected to happen.

Ciscokid2 wrote:I was being clear as possible when …
Reading the posts from several volunteers, you should be able to tell that your explanation of the situation has been poor. You risk having people ignore your posts unless you improve them in the future. We're providing free advice, and I, at least, hesitate to waste my time on people who cannot describe the problem unambiguously.

You failed to attach your actual data, so I will have to guess at a procedure, and this is my final post in this topic. Experience tells me that fictitious data often does not illustrate the real situation, and solutions created for it do not scale to the actual data. Try these two steps:
• Remove all quotation marks from the data using Edit → Find&Replace
• Use Data → Text to Columns → Separated by → Space → First field = Text, all other fields = Hide → OK
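
For anyone who wants to check the idea before touching the spreadsheet, the two steps can be imitated in a few lines of Python. This is only a sketch run against the two example lines posted in this topic, not against the real data; the function name is made up, and splitting on a single space is assumed to behave like the Space separator in Text to Columns.

```python
# The two example lines posted in this topic.
lines = [
    'John.Smith@here.ac.ca" John.Smith"@here.ac.ca>',
    '"mailto:John.Smith@here.gc.ca" <"mailto:John.Smith"@here.gc.ca>',
]


def first_field(line):
    """Remove all quotation marks, then keep the text before the first space."""
    no_quotes = line.replace('"', '')  # step 1: Find & Replace on "
    return no_quotes.split(' ')[0]     # step 2: keep only the first field


result = [first_field(line) for line in lines]

for address in result:
    print(address)
```

With these two lines the first field is the bare address in one case and the address with its mailto: prefix in the other.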

I encourage you to read the tutorial I linked above before using Text to Columns. But if you use that feature and make a mess of your data, don't forget you can use Edit → Undo to fix the damage. If you need additional assistance with Find&Replace or Text to Columns, read about those topics in Help → Index or in User Guides (PDF), or search for topics about them in the Calc Forum. Bye.

If this solved your problem, please go to your first post, use the Edit button, and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.

[Tutorial] Ten concepts that every Calc user should know

Re: Search and replace this

Posted: Thu Jun 14, 2018 8:48 pm
by Bill
Please upload a sample file. I submitted a reply and deleted it because it was based on assumptions which may or may not be true. I will not make any more guesses without a sample to test.

Re: Search and replace this

Posted: Tue Jun 19, 2018 4:55 pm
by Ciscokid2
Thank you. No, what you have is the correct format to delete....that is the sample....that is why the John Smith is in there in place of the elected politicians which would make up this large mailing list when edited with search and replace.

But there is a vitriol in this forum. I don't want to get in the line of fire. Just say it can't be done and we will leave it at that.

Thanks.

Re: Search and replace this

Posted: Tue Jun 19, 2018 8:37 pm
by Zizi64
Ciscokid2 wrote:...that is the sample...

A real ODF type file, uploaded here: THAT IS the sample. Pictures, textual samples, and other attachments can give us SOME information. But a real file gives us ALL OF the available information about your/software issues...
You do not need to upload your original file with the original data. Change some data to "dummy" data, delete the sensitive data from a copy of your file, and then upload it here. The file size limit is 128 KiB in this Forum.

Re: Search and replace this

Posted: Wed Jun 20, 2018 9:54 am
by Bill
Ciscokid2 wrote:Just say it can't be done and we will leave it at that.

Search for: [:space:].+ works for me in the sample document I created, but since I don't have a sample of your document to test, it may or may not work or might just completely destroy your document.
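
A quick way to see what this search would leave behind is to try an equivalent on the example lines from this topic. The sketch below uses Python's `re` module, which does not support the [:space:] class, so `\s` is substituted for it on the assumption that both match a whitespace character; the variable names are made up for the illustration.

```python
import re

# The two example lines posted in this topic.
lines = [
    'John.Smith@here.ac.ca" John.Smith"@here.ac.ca>',
    '"mailto:John.Smith@here.gc.ca" <"mailto:John.Smith"@here.gc.ca>',
]

# \s.+  means: the first whitespace character and everything after it.
trimmed = [re.sub(r'\s.+', '', line) for line in lines]

for text in trimmed:
    print(text)
```

On these lines everything from the first space onward is removed, so the quotation marks before the space stay in place and would need a second replace if they are unwanted.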