[postgis-users] Search Engine

Simon Greener simon at spatialdbadvisor.com
Thu Aug 5 15:27:18 PDT 2010


Uli,

> we should not drift too far into a discussion about how you would do it
> with SQL Server. This was not Ricardos question. But I admit I have to
> specify some of my comments to get clearer.

That's true. My main point is not that SQL Server is good (far from it), is that one
should try not to modify database structures just for something like text searching.
Since the original posting gave an indication that this was about indexing Just A Bunch
Of Tables (JBOT), he can do what you like, but if someone else was listening and they
can't modify designs (a production application).....

>> decision but it is foolish to suggest that "you should not do anything like
>> search tables one after another". Hell, I did just this in SQL Server
>
> The point ist not only speed. The main reason is that you loose the
> ability of PostgreSQL's full text search capabilitites to rank your
> results. With separate tables you have to pick the n best matches of
> each table and combine them afterwards to display the overall n best
> matches.

So, if I am looking for lapsed dog licenses I have to have let PostgreSQL
rank a search against a license table against one against garbage pickup times?
Perhaps one needs more business/functional information than just "index and
search a bunch of tables". Perhaps global ranking isn't needed. If it is,
for all tables, then, perhaps, PostgreSQL's implementation might need changing.

Also, and more importantly, Ricardo never mentioned the ranking of cross-table
search results as required functionality.

> Perhaps another way would be to join the tables in a view and do the
> search on the view. But then you can't fully benefit from the index.

Didn't Leo/Regina suggest functional indexes? Will they work?

> I don't know if SQL Server does not face these limitations? Anyway, "<
> 10 seconds" is not really really fast - at least for a PostgreSQL
> database ;-)

Now if you are on the fastest DB around (PostgreSQL) why would you say
(in a previous email):

> You should not do anything like search
> the tables one after the other and combine the results.

;-)

It is actually less than 5 seconds but I can't say exactly as I am not aware of
the numbers in the production environment (so I pushed it up). But it is
fronted by a client UI and I am told the customers are not complaining about
waiting for a spatial/textual search against up to 800 tables. And you know
what customers are like!

> I didn't mean anything technical but the problem, how to produce the
> most meaningful results and ranking of the results (how to weight values
> out of different fields, which words are relevant or only stop words...).

The more important issue for comparison purposes is that there is no cross-800 table ranking
(though in-table ranking, stop words etc can be done) and there is no need to modify database design.

I will say no more about SQL Server 2008 as you are right it is not relevant.

regards
Simon
-- 
SpatialDB Advice and Design, Solutions Architecture and Programming,
Oracle Database 10g Administrator Certified Associate; Oracle Database 10g SQL Certified Professional
Oracle Spatial, SQL Server, PostGIS, MySQL, ArcSDE, Manifold GIS, FME, Radius Topology and Studio Specialist.
39 Cliff View Drive, Allens Rivulet, 7150, Tasmania, Australia.
Website: www.spatialdbadvisor.com
   Email: simon at spatialdbadvisor.com
   Voice: +61 362 396397
Mobile: +61 418 396391
Skype: sggreener
Longitude: 147.20515 (147° 12' 18" E)
Latitude: -43.01530 (43° 00' 55" S)
GeoHash: r22em9r98wg
NAC:W80CK 7SWP3



More information about the postgis-users mailing list