Find Duplicate Words

bobgrey1997
Posts: 1
Joined: Tue Oct 03, 2017 9:55 am

Find Duplicate Words

Post by bobgrey1997 »

I have been working on some code for hundreds of files relating to towns and cities in my state. Many towns sit on a county border and therefore show up multiple times. I need to find all of these so that I can change them slightly, so as not to cause conflicts in the files. The program I have been using has no feature to find duplicated words, and while doing some research I found that OpenOffice Writer can do it: paste the entire file into Writer, open Find and Replace, search for \b(\w+)\s+\1\b, enable "Regular Expression" in "More Options", and click Find All. After reading that, I downloaded the entire OpenOffice package for that one feature. When I try to use it, "Search key not found"!
I even made a new document and wrote "The The" and tried searching. "Search key not found"!
Every thread I have seen on this says to use this same search pattern, and they are all from before 2010. This no longer works, so how do we find duplicate words? I have also found a solution that uses a spreadsheet and a long series of formulas to remove the duplicates, but that will not help. I do not want to remove them, I want to find them so that I can change them slightly.
OpenOffice 4.1.3, Windows 10
jrkrideau
Volunteer
Posts: 3816
Joined: Sun Dec 30, 2007 10:00 pm
Location: Kingston Ontario Canada

Re: Find Duplicate Words

Post by jrkrideau »

Try "Find All"

It would help to know what program the data came from and what language they are in.
LibreOffice 7.3.7.2; Ubuntu 22.04
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Find Duplicate Words

Post by acknak »

The pattern you mentioned works fine for me.

Steps to test it:

Start OpenOffice. Make sure that you have the current version (4.1.3; see Help > About ...)
File > New > Text Document
Type: dt
Press F3 to get a "dummy text" paragraph.
Duplicate a word in the paragraph, for example: He heard quiet quiet steps behind him.

Edit > Find & Replace
Search for: \b(\w+)\s+\1\b
Replace with:
Options/Regular expressions: ON

Click Find or Find All
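
If you want to cross-check what the pattern is supposed to match outside of Writer, here is a minimal sketch using Python's re module. Note the assumptions: Python's regex engine is not the one OpenOffice uses, so this only shows the intended behaviour of the pattern, not what Writer will do, and the helper name find_adjacent_duplicates is made up for illustration.

```python
import re

# Same pattern as typed into Find & Replace:
# a whole word, some whitespace, then the same word again.
ADJACENT_DUPLICATE = re.compile(r"\b(\w+)\s+\1\b")


def find_adjacent_duplicates(text):
    """Return (offset, matched_text) for every immediately repeated word.

    Hypothetical helper for illustration only; not part of OpenOffice.
    """
    return [(m.start(), m.group(0)) for m in ADJACENT_DUPLICATE.finditer(text)]


sample = "He heard quiet quiet steps behind him."
print(find_adjacent_duplicates(sample))     # [(9, 'quiet quiet')]
print(find_adjacent_duplicates("The The"))  # [(0, 'The The')]
```

The trailing \b matters: without it, "the then" would count as a duplicate, because "the" is followed by another "the" at the start of "then".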

If that does not work, then something's wrong with your install/setup, or you've missed a step somewhere.

If that works but the search still fails on your own document, then the document may contain characters that don't match the pattern. You can try relaxing it a bit with something like this: \b(\w+)(\W+\1\b)+
If you try that, make sure to set Match case: ON

That pattern allows one or more "non-word" characters (punctuation as well as spaces) between the duplicates, and it also matches a word repeated more than twice in a row.
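
To see the difference between the strict and the relaxed pattern, here is a small sketch in Python's re module. Again, this is an assumption-laden illustration: Python's engine is not OpenOffice's, the sample text is invented, and Python matches case-sensitively by default, which roughly corresponds to Match case: ON.

```python
import re

# Strict: duplicates separated by whitespace only.
STRICT = re.compile(r"\b(\w+)\s+\1\b")
# Relaxed: duplicates separated by any run of non-word characters,
# repeated one or more times.
RELAXED = re.compile(r"\b(\w+)(\W+\1\b)+")

# Invented sample: the same name separated by punctuation, plus a
# pair that differs only in case ("the The").
text = "Springfield, Springfield / Springfield sits near the The border."

strict_hits = [m.group(0) for m in STRICT.finditer(text)]
relaxed_hits = [m.group(0) for m in RELAXED.finditer(text)]

print(strict_hits)   # []
print(relaxed_hits)  # ['Springfield, Springfield / Springfield']
```

The strict pattern finds nothing here because a comma or slash sits between the repeats. Neither pattern reports "the The", since the two words differ in case.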

If none of that helps, then maybe you can create a small sample document with a bit of the text you're working with and attach that here.
AOO4/LO5 • Linux • Fedora 23