There is another option for searching: switch to a search engine that is separate from phpBB but can be integrated into phpBB via a PHP API.
There is such a beast, and it is named Sphinx.
This search engine is currently used on the main phpBB website and is planned as an option for the phpBB 3.2 release, but it is available now as a plugin for the 3.0.1 release. (NOTE: this board is currently running 3.0.0 and will be updated to 3.0.1 soon. I currently have 3.0.1 RC1 on my test machine here with no apparent problems. 3.0.1 stable was released on the 7th, and I will try to upgrade the system here this weekend and run it with a full dump of the data from the production server as an acceptance test.)
With the Sphinx full-text search engine we would have the following capabilities:
4.2. Boolean query syntax
Boolean queries allow the following special operators to be used:
* explicit operator AND:
hello & world
* operator OR:
hello | world
* operator NOT:
hello -world
hello !world
* grouping:
( hello world )
Here's an example query which uses all these operators:
Example 5. Boolean query example
( cat -dog ) | ( cat -mouse)
There always is implicit AND operator, so "hello world" query actually means "hello & world".
OR operator precedence is higher than AND, so "looking for cat | dog | mouse" means "looking for ( cat | dog | mouse )" and not "(looking for cat) | dog | mouse".
Queries like "-dog", which implicitly include all documents from the collection, can not be evaluated. This is both for technical and performance reasons. Technically, Sphinx does not always keep a list of all IDs. Performance-wise, when the collection is huge (ie. 10-100M documents), evaluating such queries could take very long.
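To make the precedence rules above concrete, here is a minimal sketch in Python of how such a boolean query could be evaluated against one post. This is my own toy illustration, not Sphinx code: the names tokenize, Parser, matches and search are hypothetical, and it only handles simple words (a hyphen inside a word would be read as NOT).

```python
import re

def tokenize(query):
    # Split the query into operators, parentheses, and bare words.
    return re.findall(r"[()|&!-]|[^\s()|&!-]+", query)

class Parser:
    """Toy recursive-descent parser; OR binds tighter than AND."""

    def __init__(self, query):
        self.tokens = tokenize(query.lower())
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_and(self):
        # Lowest precedence: explicit '&' or implicit AND between terms.
        node = self.parse_or()
        while self.peek() not in (None, ")"):
            if self.peek() == "&":
                self.take()
            node = ("and", node, self.parse_or())
        return node

    def parse_or(self):
        # Higher precedence than AND, as the quoted text describes.
        node = self.parse_unary()
        while self.peek() == "|":
            self.take()
            node = ("or", node, self.parse_unary())
        return node

    def parse_unary(self):
        tok = self.take()
        if tok is None or tok in (")", "|", "&"):
            raise ValueError("unexpected token: %r" % (tok,))
        if tok in ("-", "!"):
            return ("not", self.parse_unary())
        if tok == "(":
            node = self.parse_and()
            if self.take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        return ("word", tok)

def matches(node, words):
    # Evaluate the parsed query tree against the set of words in a post.
    kind = node[0]
    if kind == "word":
        return node[1] in words
    if kind == "not":
        return not matches(node[1], words)
    if kind == "and":
        return matches(node[1], words) and matches(node[2], words)
    return matches(node[1], words) or matches(node[2], words)

def search(query, text):
    words = set(re.findall(r"\w+", text.lower()))
    return matches(Parser(query).parse_and(), words)
```

For example, search("looking for cat | dog | mouse", "a mouse") is false here, because "looking" and "for" are still required. Note that this toy version will happily evaluate a bare "-dog" against a single post, which is exactly the kind of query the text above says Sphinx refuses to run over a whole collection.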
4.3. Extended query syntax
The following special operators can be used when using the extended matching mode:
* operator OR:
hello | world
* operator NOT:
hello -world
hello !world
* field search operator:
@title hello @body world
* phrase search operator:
"hello world"
* proximity search operator:
"hello world"~10
* quorum matching operator:
"the world is a wonderful place"/3
In addition, Sphinx supports weighting in a number of contexts, one of which is by field. This means that a match of all terms in a topic title could be weighted higher than a match found only in the body of the topic. (This is just an example; I haven't really thought through whether that would be a good thing.)
Edit: Another feature is the ability to timestamp search indexes, so that results can also be weighted by age: items from today are weighted higher than items from 3 months ago. The documentation discusses this feature specifically for indexing collections of small documents, such as forums or blogs.
As to the phpBB plugin, I have not actually looked it over yet. The developer, one of the phpBB core developers, is calling this a beta release, and one open issue is support for the latest stable release of Sphinx. Currently the plugin only supports the previous release, with support for the latest release coming soon.
Now, I have mentioned the idea of 'buy vs build' a couple of times regarding our using 'plugins' or 'mods' versus building our own code. This might be a case of putting the two together - getting in on the early stages of the development work might be worth doing, meaning that we would be building as well. From reading the thread about the plugin release, it sure sounds as if the developer would be willing to work with folks from the community.
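To show what field and age weighting could look like in practice, here is a rough Python sketch. Everything in it is an assumption on my part for illustration: the field weights, the 30-day half-life, the post layout and the score function are all made up, and Sphinx's real ranking formula is different.

```python
import re
from datetime import datetime, timedelta

# Hypothetical settings: a title match counts three times a body match,
# and a post loses half its weight every 30 days.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}
HALF_LIFE_DAYS = 30.0

def score(post, query, now):
    """Combine per-field term matches with an age-based decay factor."""
    terms = set(re.findall(r"\w+", query.lower()))
    relevance = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = set(re.findall(r"\w+", post[field].lower()))
        relevance += weight * len(terms & words)
    age_days = (now - post["posted"]).total_seconds() / 86400.0
    recency = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return relevance * recency
```

With these made-up numbers, a post matching both terms in its title today outranks one matching them only in the body, and the same title match from 90 days ago drops to one eighth of its original score.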