[postgis-users] "Erase" and "Intersect" performance questions - PostGIS way slower than ArcGIS
Martin Davis
mtnclimb at telus.net
Thu Aug 25 21:32:33 PDT 2011
For your query #1, it looks like you are computing the ST_Intersection
twice. Does Postgres optimize this away? IF not, you might want to use
a subquery to avoid this expensive second computation.
I also agree with Chris, that query #2 is probably not doing what you
want it to. What you need to do is for each parcel, subtract the union
of the water features covered by it. This is still likely to be slow,
however.
A general comment is that ArcGIS is using a very different approach to
compute erase and intersect (aka overlay). It evaluates the entire set
of geometries together, rather than piece-wise like the SQL query is
doing. This generally results in much better performance for large
datasets, since there is less I/O and more efficient algorithms available.
By it's nature, using SQL for spatial computation is most efficient for
operations which can be carried out in a feature-wise manner.
Unfortunately, overlay does not fall into this category (since there is
a large amount of interaction between features.
Implementing a more efficient overlay algorithm in PostGIS is a nice
challenge for the future...
On 8/24/2011 8:18 PM, Sheara Cohen wrote:
>
> Hi all --
>
> I have what is likely to sound like the newbie question it is. I am in
> the process of shifting some of our modeling workload from ArcGIS to
> PostGIS. While PostGIS seems much faster for most non-spatial
> operations, I'm finding the exact opposite for spatial operations like
> "erase," "intersect," etc. And I'm sure there is some basic thing I
> just don't know about how to write these scripts to get fast performance.
>
> Below are details for two issues I have run into.
>
> 1."Intersect": In ArcGIS, I used the intersect tool to return a
> polygon file from the intersection of two different polygon files.
> They were large input files -- one the size of the state of California
> and one the size of a county in California, both with between 200-300
> thousand records. In ArcGIS, this took 43 minutes. In PostGIS, I used
> the script below, and it took over 17 hours.
>
> CREATE TABLE public.fresno_parcels_lt_intersect as
>
> SELECT
>
> ST_Intersection(p.wkb_geometry,
> lt.wkb_geometry) as wkb_geometry,
>
> id_parcel,
>
>
> (st_area(ST_Intersection(p.wkb_geometry, lt.wkb_geometry))) *
> 0.000247105381 as acres_lt_parcel,
>
> Landtype
>
> FROM fresno_parcels_unique_id as p, ca_landtypes_010211 as lt
>
> WHERE ST_Intersects(p.wkb_geometry, lt.wkb_geometry);
>
> 2."Erase": In ArcGIS, I used the erase tool to remove water features
> (polygons) from a county parcel file. Both files were large. The water
> features covered the state of California with 100K records, and the
> parcel file had almost 300K records. In ArcGIS, this took 17 minutes.
> In PostGIS, I had to cancel the run after 16 hours. I used the script
> below.
>
> CREATE TABLE fresno_parcels_minus_ca_water as
>
> SELECT ST_GeomFromWKB (ST_Difference (wkb_geometry
> (ca_water_final_082211), wkb_geometry (fresno_parcels_clean)))
>
> FROM ca_water_final_082211, fresno_parcels_clean;
>
> I added a spatial index to all of the input files in PostGIS (CREATE
> INDEX ____ ON ____ USING gist (wkb_geometry)). Do any of you all have
> suggestions as to how to make these sorts of operations run more quickly?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20110825/9185287f/attachment.html>
More information about the postgis-users
mailing list