[postgis-users] updating points based on the poly they lie within

P Kishor punkish at eidesis.org
Tue Jul 25 08:16:59 PDT 2006


paraphrasing my task --
>> I have two tables, one of points and the other of polys.
>> I want to update the column points.col1 = polys.col1
>> for the poly each point falls within.

Update:

So, I tried some test data -- small number of points (about 70k)
against the entire set of polys (>200k). Here is my query --

UPDATE points
SET col1 = (
  SELECT b_col1
  FROM (

    -- point-in-poly overlay
    SELECT a.gid AS a_gid, b.col1 AS b_col1
    FROM points a, polys b
    WHERE (a.the_geom && b.the_geom) AND WITHIN(a.the_geom, b.the_geom)

  ) AS foo
  WHERE points.gid = foo.a_gid
)

Query returned successfully: 69692 rows affected, 146701 ms execution time.

Assuming O(N) performance, 5.25 million points should work out to
11051200 ms, or just over 3 hours. Not bad at all, considering a
similar point-in-poly procedure in the leading GIS software takes
about 4 days.

But, before I venture further, you PostGIS/SQL mavens, can you tell me
please if what I am doing above is correct? Just want to goofproof my
process here... still very, very new to Post(gres|GIS)

Gracias in advance.

On 7/25/06, P Kishor <punkish at eidesis.org> wrote:
> On 7/25/06, Gregory S. Williamson <gsw at globexplorer.com> wrote:
> > It does indeed look like you are in the right ballpark at least.
>
> Thanks Greg. Close, but no...
>
> >
> > You want to make sure you have a spatial (GIST) index on the relevant
> > columns, both polygon and point.
>
> If by "relevant columns" you mean the_geom for both, then yes, I
> created indexes on those via
>
> CREATE INDEX idx_states_the_geom ON states USING gist(the_geom);
> CREATE INDEX idx_cities_the_geom ON cities USING gist(the_geom);
>
> but...
>
> > In earlier versions of postgres/GIS you had to run an explicit command
> > [SELECT * FROM update_geometry_stats('table name', 'column name');
> > in newer (postgres 8.x I think, and postGIS > 1. something) the stats are
> > updated with the regular "ANALYZE tablename;" command.
> >
>
> ... it seems you mean something else when you say "spatial index." I
> will read up on this.
>
>
> > The bounding box operator (&&) in your queries hopes to get the planner
> > to use the indexes. It may not use them anyway if you are trying to access
> > a significant number of rows of the relevant tables (since 100,000 indexed
> > reads are more disk i/o than a sequential scan of the table, if 100,000 is
> > more than a tiny percent of your data).
>
> 100k is a very small % of my point data. I am talking about 5.25
> million points against 200,000 polys. All points are guaranteed to lie
> within a poly, but all polys may not have a point (for example, all
> cities belong to a state, but all states may not have a city). Of
> course, as I said earlier, points and polys are in different tables. I
> tried a few variations on the the UPDATE command, but failed. Here is
> one I tried, and the error I got --
>
> UPDATE cities
> SET cities.state_id = states.state_id
> FROM cities, states
> WHERE cities.the_geom && states.the_geom AND
>   WITHIN(cities.the_geom, states.the_geom)
>
>
> ERROR:  current transaction is aborted, commands ignored until end of
> transaction block
>
>
> >
> > I grabbed this example from a some code in a plpgsql function:
> > UPDATE raw_log_data
> > SET fips = p_fips
> > WHERE (req_pnts && sp_county_geom)
> > AND (distance(req_pnts, sp_county_geom) <= 0);
> >
> > The req_pnts are a column with point data and sp_county_geom is the area
> > of interest. In this case we iterate through our areas of interest -- that turned
> > out to be faster than trying to do all of them at once (think about 8000
> > polygons and some log files with a few million points)
> >
> > The && does make use of indexes and the distance () test seems to be a bit
> > faster than other methods of making sure a point is on or in a polygon.
>
>
> I will try out the above variation. Any speed bump would be nice. Just
> doing a plain vanilla GIS point-in-poly overlay takes about 4 days to
> finish the process, so cutting that down with the help of PostGIS
> would be really nice.
>
>
> >
> >
> > -----Original Message-----
> > From:   postgis-users-bounces at postgis.refractions.net on behalf of P Kishor
> > Sent:   Mon 7/24/2006 10:08 PM
> > To:     Postgis
> > Cc:
> > Subject:        [postgis-users] updating points based on the poly they lie within
> >
> > I have been tinkering with PostGIS for the past few hours, and am
> > simply amazed by its capabilities (yes, I've been living under a
> > rock).
> >
> > I have two tables, one of points (say, cities), and the other of polys
> > (say, states). I want to update the column cities.state_id =
> > states.state_id for the state it falls within. A little bit of
> > snooping around, and I got the following query working
> >
> > SELECT c.gid, s.state_id
> > FROM cities c, states s
> > WHERE c.the_geom && s.the_geom
> >  AND WITHIN(c.the_geom, s.the_geom)
> >
> > which finds for me the state_ids for each city. A few newbie questions --
> >
> > 1. am I doing the right thing above?
> > 2. what would the syntax be for updating c.state_id with the state_id
> > of the state it falls within? I am thinking something like
> >
> > UPDATE cities
> > SET state_id = (
> >   SELECT s.state_id
> >   FROM cities c, states s
> >   WHERE c.the_geom && s.the_geom AND WITHIN(c.the_geom, s.the_geom)
> > )
> > WHERE EXISTS (
> >   SELECT s.state_id
> >   FROM cities c, states s
> >   WHERE c.the_geom && s.the_geom AND WITHIN(c.the_geom, s.the_geom)
> > );
> >
> > Am I out of whack here?
> >
> > 3. Are indexes on the_geom enough to speed this up?
> >
> > I have been reading the following point-in-poly thread
> > <http://postgis.refractions.net/pipermail/postgis-users/2006-March/011678.html>,
> > and it was not very encouraging speed-wise. However, I haven't
> > experienced major slow down yet. Besides, this calculation is to be
> > performed very infrequently, so speed is important but not super
> > important.
> >
> > Many thanks in advance for all suggestions.
> >
> > --
> > Puneet Kishor http://punkish.eidesis.org/
> > Nelson Inst. for Env. Studies, UW-Madison http://www.ies.wisc.edu/
> > Open Source Geospatial Foundation https://edu.osgeo.org/
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> > !DSPAM:44c5a74a216985856016513!
> >
> >
> >
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
>
>
> --
> Puneet Kishor http://punkish.eidesis.org/
> Nelson Inst. for Env. Studies, UW-Madison http://www.ies.wisc.edu/
> Open Source Geospatial Foundation https://edu.osgeo.org/
>


-- 
Puneet Kishor http://punkish.eidesis.org/
Nelson Inst. for Env. Studies, UW-Madison http://www.ies.wisc.edu/
Open Source Geospatial Foundation https://edu.osgeo.org/



More information about the postgis-users mailing list