
Search for Phrase

Posted: Fri Apr 18, 2008 10:44 pm
by bhmt
The forum software supports searching for any of the words submitted, or for all of the words submitted.

1. Does it support searching for a phrase? Like "user directory" versus those two words anywhere in the body of the message? I tried using quotation marks but that doesn't work.

2. Does it support any type of negation? Perhaps you want messages with XP but not if they also mention Vista? Like (XP not Vista )?

Thanks.
 Edit: searching from off site using Google works.

For example, wanting my phrase "user folder",

http://www.google.com/search?q=site:use ... %20folder"

And I think negation is supported with a " - ", as in user -folder

In Firefox, with its search window, you'd only type in "user folder" or user -folder,
so it is very easy.
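
For anyone who wants to build such a link by hand, here is a minimal sketch of how the Google URL could be assembled. The function name and the host forum.example.org are placeholders I made up for illustration; substitute the real forum host.

```python
from urllib.parse import quote_plus


def google_site_search_url(site, phrase=None, include=(), exclude=()):
    """Build a Google search URL restricted to one site (hypothetical helper)."""
    terms = [f"site:{site}"]
    if phrase:
        terms.append(f'"{phrase}"')  # double quotes ask for the exact phrase
    terms.extend(include)  # plain words, matched anywhere
    terms.extend(f"-{word}" for word in exclude)  # leading minus excludes a word
    return "https://www.google.com/search?q=" + quote_plus(" ".join(terms))


# Phrase search, then "XP but not Vista"
print(google_site_search_url("forum.example.org", phrase="user folder"))
print(google_site_search_url("forum.example.org", include=["XP"], exclude=["Vista"]))
```

Note that the minus sign sits directly in front of the excluded word, with no space between them.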


Bh
 

Re: Search for Phrase

Posted: Fri Apr 18, 2008 10:56 pm
by Villeroy
Use google (or your preferred search engine).
Search box: "X11 starts with command and waits for timeout" site:user.services.openoffice.org

URL:http://www.google.de/search?q=%22X11+st ... %3Dlang_en

Re: Search for Phrase

Posted: Sat Apr 19, 2008 12:21 am
by acknak
It would be really sweet to have a "Search this site with Google" link or button right on the search page.

You can make your own keyword bookmark to do that in Firefox, but it's still not as nice as having a form or button on the page.

Re: Search for Phrase

Posted: Sat Apr 19, 2008 1:15 am
by bhmt
Thanks guys. I think "X11" was for someone else?

But the Google and "...site:user..." works nice (on quick try, anyway). I use that on Solveig's site, too. Didn't think of it here.

I know I can drag a static URL onto the FF bookmark toolbar, but not how to enhance them. That is, hard coded for a site but with replaceable target "question". But, there's a site that does that:

Bookmarklets.com

http://www.bookmarklets.com/mk.phtml

Which works nice (for most sites that put the code into the address bar). Builds the URL in javascript and you drag it to your toolbar. It grabs what you might have highlighted on the web page as your "question," and if nothing is highlighted it pops up a dialog for you to type into.
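
Roughly what such a generated bookmarklet does can be sketched as follows. This is not the code Bookmarklets.com produces, only my guess at an equivalent: a small Python helper (the name and the host forum.example.org are made up) that emits a javascript: URL which takes the highlighted text, or asks for it if nothing is highlighted, and sends it to a Google site search.

```python
def make_search_bookmarklet(site):
    """Return a javascript: URL that searches `site` through Google.

    Hypothetical helper; `site` is assumed to be a plain host name
    without quotes or spaces.
    """
    script = (
        "(function(){"
        # Use the highlighted text if there is any, otherwise ask the user.
        "var q=String(window.getSelection())||prompt('Search for:');"
        # Do nothing if the prompt was cancelled or left empty.
        "if(q){location.href='https://www.google.com/search?q='"
        "+encodeURIComponent('site:" + site + " '+q);}"
        "})();"
    )
    return "javascript:" + script


print(make_search_bookmarklet("forum.example.org"))
```

The printed string would be pasted in as the location of a new bookmark on the toolbar.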

But it is a Java applet thingy. (It still says "IE and ...Netscape" after all these years.)

Thanks again

Re: Search for Phrase

Posted: Sat Apr 19, 2008 1:34 am
by Villeroy
bhmt wrote:
Thanks guys. I think "X11" was for someone else?
I tested a Google search on this site using the subject line of an arbitrary old thread. It had to be an old thread because a new one would not yet be indexed by Google.

If you want to search all well known OOo-sites try http://search.oooninja.com by this forum's most valued member AndrewZ. OOops, seems to be down at this moment.

Re: Search for Phrase

Posted: Sat Apr 19, 2008 2:14 am
by bhmt
Villeroy wrote:
Thanks guys. I think "X11" was for someone else?
I tested a google search on this site ....
<smacks head>

Thanks Villeroy! I locked onto "X11" and immediately figured an answer composed off site was pasted into the wrong message by accident. Focused on the tree and missed the forest. :--(

Re: Search for Phrase

Posted: Sat Apr 19, 2008 2:15 am
by DrewJensen
To the best of my knowledge searching for phrases is not supported by the phpBB search function at all.

Negation is a feature that is just busted. I have a ticket in for that with the phpBB folks, but it seems that it is getting zero attention (supposedly because it only affects installations using PostgreSQL as the database). I am afraid it is going to be up to us to fix this. arrrgh.

About the google search option that is something we could do for sure. Actually that is something that I think we should do.

Re: Search for Phrase

Posted: Sat Apr 19, 2008 3:22 am
by bhmt
DrewJensen wrote:... phrases is not supported by the phpBB .

Negation is a feature that is just busted....
I tried the suggestion to search from off site using Google. It was able to parse the data for phrases and for negation (I didn't try anything too exotic).

I put that as an edit in my OP up top.

Re: Search for Phrase

Posted: Sat Apr 19, 2008 3:48 am
by DrewJensen
There is another option for searching, and that would be to switch to a search engine that is separate from phpBB but that can be integrated into phpBB via a PHP API.

There is such a beast and it is named Sphinx

This search engine is currently used at the phpBB main website and is planned as an option for the phpBB 3.2 release, but is available as a plugin for the 3.0.1 release now. ( NOTE: this board is currently running 3.0.0 and will be updated to 3.0.1 soon - currently I have 3.0.1 RC1 on my test machine here at my location with no apparent problems - 3.0.1 stable was released on the 7th and I will try to upgrade the system here this weekend and run with a full dump of the data from the production server as an acceptance test. )

With the sphinx full text search engine we would have the following capabilities:
4.2. Boolean query syntax
Boolean queries allow the following special operators to be used:

* explicit operator AND:
hello & world

* operator OR:
hello | world

* operator NOT:
hello -world
hello !world

* grouping:
( hello world )

Here's an example query which uses all these operators:

Example 5. Boolean query example

( cat -dog ) | ( cat -mouse)

There always is implicit AND operator, so "hello world" query actually means "hello & world".

OR operator precedence is higher than AND, so "looking for cat | dog | mouse" means "looking for ( cat | dog | mouse )" and not "(looking for cat) | dog | mouse".

Queries like "-dog", which implicitly include all documents from the collection, can not be evaluated. This is both for technical and performance reasons. Technically, Sphinx does not always keep a list of all IDs. Performance-wise, when the collection is huge (ie. 10-100M documents), evaluating such queries could take very long.
4.3. Extended query syntax

The following special operators can be used when using the extended matching mode:

* operator OR:
hello | world

* operator NOT:
hello -world
hello !world

* field search operator:
@title hello @body world

* phrase search operator:
"hello world"

* proximity search operator:
"hello world"~10

* quorum matching operator:
"the world is a wonderful place"/3
In addition it supports weighting in a number of contexts, one of which is by field. This means that a match of all terms in a topic title could be weighted greater than a match in the body of the topic only. (Just an example; I haven't really thought through whether that would be a good thing.)
Edit: Another feature is the ability to timestamp search indexes, so that one can also weight results by age, so that items from today are weighted higher than items from 3 months ago. The documentation discusses this feature specifically for indexing collections of small documents, such as forums or web blogs.
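
To make the two weighting ideas concrete, here is a rough sketch. It is not how Sphinx computes its ranking; the field weights, the 90-day half-life, and the post layout are all numbers and names I invented for illustration.

```python
from datetime import datetime, timedelta

FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}  # hypothetical: a title hit counts 3x
HALF_LIFE_DAYS = 90.0  # hypothetical: a post's score halves every 90 days


def score(post, terms, now):
    """Sum weighted term hits per field, then discount by the post's age."""
    hits = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        field_words = post[field].lower().split()
        hits += weight * sum(field_words.count(term) for term in terms)
    age_days = (now - post["posted"]).total_seconds() / 86400
    return hits * 0.5 ** (age_days / HALF_LIFE_DAYS)


now = datetime(2008, 4, 19)
fresh = {"title": "search phrase", "body": "how to search", "posted": now}
old = dict(fresh, posted=now - timedelta(days=90))
print(score(fresh, ["search"], now), score(old, ["search"], now))
```

With identical text, the three-month-old post ends up with half the score of today's post, which is the "weight by age" behaviour described above.
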
As to the phpBB plugin, I have not actually looked it over as of yet. The developer, one of the phpBB core developers, is calling this a beta release and one issue that is open is support for the latest stable release of Sphinx. Currently the plugin only supports the previous release, with support for the latest release coming soon.

Now, I have mentioned the idea of 'buy vs build' a couple of times regarding our using 'plugins' or 'mods' versus building our own code. This might be a case of putting the two together - getting in on the early stages of the development work might be something worth doing - meaning that we would be building also. In reading over the thread about the plugin release it sure sounds as if the developer would be willing to work with folks from the community.
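
For reference, the precedence rules in the boolean syntax quoted above (implicit AND, OR binding tighter than AND, NOT with - or !) can be sketched with a tiny evaluator. This is only my illustration of the rules as the documentation states them, not Sphinx code. It checks a single document's word set, does not reproduce Sphinx's refusal of NOT-only queries, assumes the words are lower case, and assumes a well-formed query.

```python
import re


def tokenize(query):
    # Split into operator characters (& | - ! parentheses) and plain words.
    return re.findall(r"[&|()!-]|[^\s&|()!-]+", query.lower())


def matches(query, words):
    """Return True if the lower-case word set `words` satisfies `query`."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_and():
        # AND binds loosest; the "&" is optional (implicit AND).
        result = parse_or()
        while peek() is not None and peek() != ")":
            if peek() == "&":
                take()
            right = parse_or()  # always parse, so tokens are consumed
            result = result and right
        return result

    def parse_or():
        # OR binds tighter than AND.
        result = parse_unary()
        while peek() == "|":
            take()
            right = parse_unary()
            result = result or right
        return result

    def parse_unary():
        if peek() in ("-", "!"):
            take()
            return not parse_atom()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            result = parse_and()
            take()  # consume the closing parenthesis
            return result
        return token in words

    return parse_and()


# "looking for cat | dog | mouse" means "looking for ( cat | dog | mouse )"
print(matches("looking for cat | dog | mouse", {"looking", "for", "mouse"}))
print(matches("looking for cat | dog | mouse", {"dog"}))
```

The second call is the telling one: a document containing only "dog" would match under the reading "(looking for cat) | dog | mouse", but not under the precedence the documentation describes.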

Re: Search for Phrase

Posted: Sat Apr 19, 2008 3:43 pm
by acknak
I wonder if the boolean search features will be that useful. I don't think I've used anything beyond simple space-separated words, or "quoted phrases", more than a half-dozen times with Google--ever, because it's very good at up-ranking hits that are more relevant.

If they try the search feature at all, most people will just type in one or two words. If that produces a few relevant results, rather than none, or 250, then people will see that it saves them time and they'll be more likely to use it in the future.

Is it possible/easy to set up a test-bed, even one that may not be completely integrated with the rest of the test forum? I think the right test would be to try it against our data, and see if it gives relevant results with simple searches.

Re: Search for Phrase

Posted: Sat Apr 19, 2008 4:07 pm
by DrewJensen
Yes, a test bed would be a requirement.

Boolean search: just because it is possible doesn't mean you have to use it.

But what we really need to do, I think, at this point, is to look at an overall picture of what a decent search function would be for a support forum. It isn't just the search engine - there have been other topics here of late
- saving the search criteria ( or not losing it might be more apt )
- making the results list default to opening new _blank pages when clicked rather than replacing the current page contents
to name two.

Then there is the issue of tagging topics. In the limited case tagging a topic SOLVED, but the solution I have been looking at could go further and allow the topic to be tagged as SUPPORT, ANNOUNCEMENT, SOLVED, TUTORIAL...etc not just SOLVED. These would be predefined tags available to the original poster at the time and to moderators. Would it make sense to allow the search function to be limited to a tag or set of tags? Would it make sense to use the tags in a weighting fashion to move things like TUTORIAL and SOLVED to the top of the list?
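
As a thought experiment, limiting a search to tags and using tags as a weighting factor might look something like this. The boost values and the shape of the result tuples are pure assumptions on my part; nothing like this exists in phpBB as far as this thread establishes.

```python
TAG_BOOST = {"SOLVED": 2.0, "TUTORIAL": 1.5}  # hypothetical boost factors


def rank_by_tags(results, only_tags=None):
    """Order results by boosted score; `results` holds (title, score, tags)."""
    ranked = []
    for title, base_score, tags in results:
        if only_tags and not (set(tags) & set(only_tags)):
            continue  # topic carries none of the requested tags
        # Apply the largest boost among the topic's tags (1.0 if none apply).
        boost = max((TAG_BOOST.get(tag, 1.0) for tag in tags), default=1.0)
        ranked.append((base_score * boost, title))
    ranked.sort(reverse=True)
    return [title for _, title in ranked]


results = [
    ("Where is my user folder?", 1.0, ["SUPPORT"]),
    ("User folder moved", 0.8, ["SOLVED"]),
    ("Guide to the user folder", 0.9, ["TUTORIAL"]),
]
print(rank_by_tags(results))
print(rank_by_tags(results, only_tags=["SOLVED", "TUTORIAL"]))
```

Whether SOLVED should outrank TUTORIAL, or by how much, is exactly the kind of thing a test bed would have to settle.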

Add to this a more general categorizing function and one could envision an even richer set of possibilities.

Then there is the option to rate topics. Currently we don't support any rating system, but we could. Even if we did not offer ratings for everything on the site, it might make sense to offer ratings for the tutorial section. If we did that would this again make sense as a weighting factor in a search result - higher rated topics to the top?

There is the question of 'expansiveness', meaning that the forum is only one place to find an answer. There is the wiki, for instance. A Google search would offer one advantage: we could expand it to include the wiki along with the forum. Of course Google (or any external search engine) has downsides too, and lag is one. A document doesn't exist for Google until its bot has crawled the site and processed the results into its search index, whereas a native search engine would be updated as soon as a topic is posted. Google also has no way to know that a topic listed in its index has been moved or deleted; a native search does.
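
The freshness point can be illustrated with a toy in-memory index. This is a sketch of the general idea only, not of phpBB's or Sphinx's index; the class and method names are invented. The index changes in the same step as the post or deletion, so a search never sees stale topics.

```python
from collections import defaultdict


class ForumIndex:
    """Toy inverted index that is updated whenever a topic changes."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> ids of topics containing it
        self._words = {}  # topic id -> set of words indexed for that topic

    def add_topic(self, topic_id, text):
        self.remove_topic(topic_id)  # re-indexing replaces any old entry
        words = set(text.lower().split())
        self._words[topic_id] = words
        for word in words:
            self._postings[word].add(topic_id)

    def remove_topic(self, topic_id):
        for word in self._words.pop(topic_id, set()):
            self._postings[word].discard(topic_id)

    def search(self, query):
        """Return the ids of topics that contain every word of the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        return set.intersection(*(self._postings.get(t, set()) for t in terms))


index = ForumIndex()
index.add_topic(1, "Where is the user folder")
print(index.search("user folder"))  # found right after posting
index.remove_topic(1)
print(index.search("user folder"))  # gone right after deletion
```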

So it seems to me that we need to spend some time really looking at this as an overall approach.