[postgis-users] Run on database or not?

Paul Ramsey pramsey at cleverelephant.ca
Tue Nov 24 09:22:48 PST 2009


Martin++. If you don't mind the complexity of writing your own little
query engine, holding all the features in memory as JTS
PreparedGeometries, with an STRtree on top of them (assuming your
collection is more than a few items), will get you better performance
than PostGIS. PostGIS has to prepare geometries once per query, whereas
the custom app can prepare them once at startup and reuse them for
every query after that.
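
A minimal sketch of that arrangement, for illustration only. It assumes
the current org.locationtech.jts package names (older JTS releases used
com.vividsolutions.jts), and the class InMemoryWithin with its box() and
featuresContaining() methods are hypothetical names, not part of JTS:

```java
import java.util.ArrayList;
import java.util.List;

import org.locationtech.jts.geom.Coordinate;
import org.locationtech.jts.geom.Envelope;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.geom.GeometryFactory;
import org.locationtech.jts.geom.Point;
import org.locationtech.jts.geom.prep.PreparedGeometry;
import org.locationtech.jts.geom.prep.PreparedGeometryFactory;
import org.locationtech.jts.index.strtree.STRtree;

/** Hypothetical in-memory lookup: which stored features contain a point? */
public class InMemoryWithin {
    private static final GeometryFactory FACTORY = new GeometryFactory();
    private final STRtree index = new STRtree();

    /** Prepare every feature once, at startup, and index it by its envelope. */
    public InMemoryWithin(List<Geometry> features) {
        for (Geometry g : features) {
            PreparedGeometry prepared = PreparedGeometryFactory.prepare(g);
            index.insert(g.getEnvelopeInternal(), prepared);
        }
        index.build(); // no further inserts are allowed after the tree is built
    }

    /** Rough equivalent of running ST_Within(point, feature) over all features. */
    public List<Geometry> featuresContaining(double x, double y) {
        Point p = FACTORY.createPoint(new Coordinate(x, y));
        List<Geometry> hits = new ArrayList<>();
        // The tree returns candidates whose envelopes intersect the point's envelope...
        for (Object candidate : index.query(p.getEnvelopeInternal())) {
            PreparedGeometry prepared = (PreparedGeometry) candidate;
            // ...and the prepared geometry performs the exact containment test.
            if (prepared.contains(p)) {
                hits.add(prepared.getGeometry());
            }
        }
        return hits;
    }

    /** Illustration helper: an axis-aligned rectangle as a polygon. */
    public static Geometry box(double minX, double minY, double maxX, double maxY) {
        // Envelope's constructor takes (x1, x2, y1, y2)
        return FACTORY.toGeometry(new Envelope(minX, maxX, minY, maxY));
    }

    public static void main(String[] args) {
        InMemoryWithin engine = new InMemoryWithin(
                List.of(box(0, 0, 10, 10), box(20, 20, 30, 30)));
        System.out.println(engine.featuresContaining(5, 5).size());
    }
}
```

The preparation cost is paid once in the constructor; each call to
featuresContaining() only does an index lookup plus the prepared
containment test on the few candidates it returns.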

P.

On Tue, Nov 24, 2009 at 8:59 AM, Martin Davis <mbdavis at refractions.net> wrote:
> Actually, the JTS Java library (which PostGIS uses the C++ port of, GEOS) is
> generally quite a bit faster than the same routine inside PostGIS.  The
> "maths" is not the hot spot in the geometric routines.
>
> There are reasons why you might not want to load all data into client-side
> memory (eg code complexity and data volume) but performance is *not* one of
> them.
>
> The best caching that you can do for this case is to use a PreparedGeometry
> for your target "large data".  PostGIS does some of this already, although
> it may or may not be able to be applied in your queries.  If you implement
> this client-side in Java you can easily construct your code so that all
> tests are carried out against PreparedGeometrys.
>
>
> Brian Modra wrote:
>>
>> 2009/11/24 tommy408 <tommytomorow at msn.com>:
>>
>>>
>>> I have a web application that requires a lot of queries, like 200 queries
>>> a second.  And they're mostly running ST_Within on large data.  Should I
>>> just load the data into memory to do the calculations (I'm using Java) or
>>> keep them in Postgres and run the queries?  I don't know if the database
>>> is already doing caching.
>>>
>>
>> ST_Within is a function; as far as I know, it's not doing its own caching.
>> But the database does naturally cache. So if you do the same query
>> again, it won't have to do so much IO... and the GiST index will
>> probably also be in cache.
>>
>> You'll have to benchmark it yourself, because it depends a lot on your
>> data and query.
>> But caching and doing the calculations in memory using Java will
>> likely be slow anyway if the data is large, because Java is not great
>> for maths. Rather cache one complete result with its query data, and
>> then if a matching query comes in, you have the results already. But
>> let the database find the result in the first place rather than do the
>> maths in Java.
>>
>>
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
> --
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>

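Brian's suggestion above, to cache one complete result with its query
data and reuse it when a matching query comes in, could be sketched
roughly as follows. This is an assumption-laden illustration using only
the Java standard library: ResultCache is a hypothetical class, the key
is whatever identifies a query (for example its parameters as a
string), and the loader stands in for the code that actually asks the
database.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache of complete query results, keyed by the query. */
public class ResultCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;

    public ResultCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Return the cached result, or run the loader (e.g. the database query) once. */
    public synchronized V get(K key) {
        V cached = entries.get(key);
        if (cached == null) {
            // Simplification: the lock is held while loading, which serializes misses.
            cached = loader.apply(key);
            entries.put(key, cached);
        }
        return cached;
    }

    public synchronized int size() {
        return entries.size();
    }
}
```

This only helps when identical queries repeat; it does nothing for
queries with parameters the cache has not seen before.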

