[postgis-users] Re: geos based spatial queries

Peter Aberline AberlineP at willis.com
Mon Nov 24 11:33:18 PST 2003


Hi all,

Thanks for your assistance with those GEOS based queries. As promised, here
are my timings. As it turns out, my testing using "distance" rather than
"within" made a huge different to execution time. In all cases there are
GIST indices on the geometry columns and btree's on the state columns. The
"Big State" geometry has 43815 vertices.

I'll post similar timings for a 15 million point dataset in a couple of
days. In regards to a posting I made a couple of weeks ago I've also had
success with using PostGIS from .NET via a J# recompilation of all the
required code . I'll make a separate posting with more details in a couple
of days.

I'm currently trying to do a selection using the Query 2 syntax from the
points table that fall within the US coastline, represented by a 950000
vertex polygon. The query is selecting from the same points table (1.6
million records). There are GIST indexes on both table's geometry cols, and
I'm expecting only a couple of hundred points to fall outside the border.
This query is performing very, very slowly - it's been going for over 48
hours now! Does anyone have any idea of why this is taking so long? Is
query time using "distance" a factor of both the area and number of
vertices in the containing polygon?

thanks,
Peter.


Query 1:
select count(*)
from points
where within(points.the_geom, (select the_geom
                                                             from states
                                                             where
state_name = 'Big State')) AND
             points.the_geom && (select the_geom
                                                       from states
                                                       where state_name
= 'Big State'

Result: 709251
PostgreSQL 7.3.4, Postgis 0.8, Cygwin, Windows 2000, 1.2Gb Ram, 1.7Ghz:  3
Hours 43minutes!


Query 2:
select count(*)
from points p
where distance(p.the_geom, (select the_geom
                                                          from usa_states
                                                          where state_name
= 'Big State'))  <= 0 AND
              aetnametrony.the_geom && (select the_geom

from usa_states

where state_name = 'Big State')

Result: 709251
PostgreSQL 7.3.4, Postgis 0.8, Cygwin, Windows 2000, 1.2Gb Ram, 1.7Ghz:
5min 42 sec
PostgreSQL 7.2.1 (Native Port), Windows NT 4,  256Mb Ram, 1Ghz: 24min, 17
sec
PostgreSQL 7.3.4, Postgis 0.8, Mandrake Linux 9.1, 256Mb Ram, 700Mhz
Athlon: 24min, 20 secs

Query 3
SELECT count(*)
FROM points p, states s
WHERE s.state_name = 'Big State' AND
                  p.the_geom && s.the_geom AND
                  distance(p.the_geom,s.the_geom)  <= 0;

Result: 709251
PostgreSQL 7.3.4, Postgis 0.8, Cygwin, Windows 2000, 1.2Gb Ram, 1.7Ghz:
5Mins, 28 sec
PostgreSQL 7.2.1 (Native Port), Windows NT 4,  256Mb Ram, 1Ghz:  27mins, 30
secs
PostgreSQL 7.3.4, Postgis 0.8, Mandrake Linux 9.1, 256Mb Ram, 700Mhz
Athlon: 16min, 12 secs


Query 4
SELECT count(*)
FROM points
WHERE the_geom && expand(GeometryFromText('POLYGON((someVal someVal,
someVal someVal,
someVal someVal,
someVal someVal))', 4269) ,  0.02) AND
                 distance(transform(the_geom, 26918),
transform(GeometryFromText('POINT(someVal someVal)', 4269),  26918)) <
1609.344

Result: 7228
PostgreSQL 7.3.4, Postgis 0.8, Cygwin, Windows 2000, 1.2Gb Ram, 1.7Ghz:
2mins 13 sec
PostgreSQL 7.2.1 (Native Port), Windows NT 4,  256Mb Ram, 1Ghz: Error:
Operation on two GEOMETRIES with different SRIDs
PostgreSQL 7.3.4, Postgis 0.8, Mandrake Linux 9.1, 256Mb Ram, 700Mhz
Athlon: 1min 55 sec




Quoting Chris Hodgson <chodgson at refractions.net>:
>I think I can speak for Refractions when I say, do tell!

>We're always interested in performance info. If it's slow, then we'll know
we
>need to make it faster; if its fast, then that's great. No harm in letting

>everone know which way is fast and which way is slow. There's no license
>attached to postgis disallowing you from publishing performance
comparisons
>without our explicit permission (as there is with Oracle...) so go right
ahead.
>No place would be better than here on the list.

>Please post the hardware you're running the tests on (architecture,
processor,
>memory) as well as some idea of the size of the dataset so we have a
baseline
>for the performance measurements.


Quoting Peter Aberline <AberlineP at willis.com>:

> Hi all,
>
> I've being doing some follow up work on my "GEOS based spatial queries"
> posting of a few days ago. I've done some timings on the various query
> techniques suggested and I thought they might be of interest to users.
I'm
> unsure if it's correct etiquette to post them to the list, though.
>
> Can I have some guidance from Refractions on this please?
>
> thanks,
> Peter.

_____________________________________________________________

The information in this email and in any attachments is confidential and
may be privileged.  If you are not the intended recipient, please destroy
this message, delete any copies held on your systems and notify the sender
immediately.  You should not retain, copy or use this email for any
purpose, nor disclose all or any part of its content to any other person.






More information about the postgis-users mailing list