[postgis-users] Problem with ST_Within, Polygons and Multipolygons

Kevin Neufeld kneufeld at refractions.net
Thu May 7 09:58:29 PDT 2009


Hi Jorge,

So, I had a quick look at your data ... I'm sorry to say that you've got a lot of problems with your ocean polygon.
Paul can correct me since he did the work on isvalidreason, but I think st_isvalid only returns the first thing it finds 
wrong with your input geometry.

I did this and found all kinds of things:

-- Extract the POLYGONs from the MULTI
CREATE TABLE water_polys AS
SELECT (ST_Dump(the_geom)).geom AS the_geom
FROM water;

SELECT ST_IsValidReason(the_geom)
FROM water_polys
WHERE NOT ST_IsValid(the_geom);

So, looking at the individual polygons (of which only the first error is again reported), I found over 66 
self-intersections (figure 8's and so forth) and several whose boundaries loop back and touch itself.

Also, in response to your original question about why a point on your airport also intersects the ocean polygon: it 
turns out that the island in question is not a hole in your ocean at all ... but a second polygon on top of your ocean 
polygon.  This is why your point query returns true - there are two surfaces (the island and the water) that intersect 
the point (which, by the way, also makes an invalid multipolygon).

A while back I put together some isvalid/issimple geometry descriptions with soem pretty pictures:
http://postgis.refractions.net/documentation/manual-svn/ch04.html#OGC_Validity

You may want to have a look through it and clean up your dataset.  The invalidity will most certainly result in more 
unexpected behaviour.

Cheers,
Kevin


Jorge Arévalo wrote:
> Hello,
> 
> I have additional info. When running the method "isvalid", I get this 
> warning message:
> 
> NOTICE:  Self-intersection at or near point -2749.99 4.7955e+06
> 
> What is exactly a "self-intersection"? May this "self-intersection" be 
> the cause of my problem?
> 
> Thanks
> Jorge
> 
> 
> 
> 2009/5/6 Jorge Arévalo <jorge.arevalo at gmail.com 
> <mailto:jorge.arevalo at gmail.com>>
> 
>     Hi,
> 
>     Thanks for your response Kevin. I can provide some data:
> 
>     - Screenshot (over geoserver) of the "water" multipolygon. You can
>     see the shape of Spain and the Balearic Islands, and a hole on the
>     left (Portugal). The multipolygon is open, isn't it? -->
>     http://www.nebulared.com/tmp_geo/water_multipolygon.jpg
> 
>     - Zoom over Menorca (one of the Balearic Islands). As you can see,
>     the point tested is inside the "airports" multipolygon, but
>     ST_Within and ST_Contains returns that the point is inside both,
>     "water" and "airports" multipolygons -->
>     http://www.nebulared.com/tmp_geo/zoom_in_water_multipolygon.jpg
> 
>     - The "ST_AsText" data for the "airport" multipolygon -->
>     http://www.nebulared.com/tmp_geo/airports_multipolygon.txt
> 
>     - The "ST_AsText" data for the "water" multipolygon (rar file, 2.5
>     MB of plain text...) -->
>     http://www.nebulared.com/tmp_geo/water_multipolygon.rar
> 
>     So, how can I check the validity of my multipolygons? Do someone
>     could help me with this?
> 
>     Now, the query. My context is the next:
>     - I have a table with polygons that I want to check (to say if they
>     are in an airport, in a forest, in water, etc). We can call this
>     table T1. I use the centroid of the polygons to check, not the
>     entire polygon.
>     - In other different table, that we'll call T2, I have all the
>     multipolygons ("water", "airports", etc). The name of the field that
>     says if the multipolygon is "water" or "airport" or anything is
>     "clutter"
> 
>     With this code, I pretend to loop over T1, checking if the centroid
>     of its polygons are inside of one (and ONLY one) of the
>     multipolygons of T2.  (Is part of a procedure)
> 
>     (...)
>     BEGIN
>       FOR s IN SELECT * FROM T1 LOOP
>             select clutter from T2 where
>     ST_Within(ST_Centroid(T1.wkb_geometry), T2.wkb_geometry))
>             (...)
>       END LOOP;
>       RETURN;
>     END;
> 
>     Most of the times, the select inside the loop, returns more than one
>     result ("water" and other one, normally). That's my problem
> 
>     I have another version of the query, with Java. Again, I loop over
>     the table T1, and check the centroid of its polygons with all the
>     multipolygons (water, airports...) that are in T2. Then, my query,
>     inside the loop, is:
> 
>     SELECT clutter FROM T2 WHERE ST_Within(?, T2.wkb_geometry)
> 
>     The '?' is replaced with the centroid of the polygon in T1, of
>     course. Basically, is the same idea of the previous SQL code, but
>     with Java.
> 
>     Apart from this, other problem is that ST_Within takes a long time.
>     1 sec per query. And I have like 2 millions of polygons (queries) to
>     check...
> 
>     My multipolygons' table (T2) has only 29 rows (small table with
>     large geometries), and I found this
>     http://postgis.refractions.net/documentation/manual-1.3/ch05.html.
>     Seems to be good for me. But using "SET enable_seqscan TO off", the
>     queries take the same time (~ 1 sec). Maybe the other approach
>     (create an additional column that "caches" the bbox, and matching
>     against this) could help me. Any ideas on how to create this bbox
>     with my huge multipolygons?
> 
> 
>     Thanks in advance
> 
>     Regards
>     Jorge
> 
> 
>     On Wed, May 6, 2009 at 4:28 AM, Kevin Neufeld
>     <kneufeld at refractions.net <mailto:kneufeld at refractions.net>> wrote:
> 
>         Jorge Arévalo wrote:
> 
>             ... If is useful, I tested the method "ST_isvalid" with the
>             multipolygon and returns "false". Maybe the multipolygon is
>             not closed? I loaded the data from a shapefile. Is it
>             possible to create a "non-valid" multipolygon? Does PostGIS
>             accept this?
> 
> 
>         Ah, yes.  Most spatial predicates in PostGIS assume the input
>         geometry is valid.  Suppose you had defined a POLYGON with a
>         hole or inner ring outside of the exterior ring.  What is the
>         area?  Does the question even make sense?  PostGIS allows
>         invalid geometries in the database so users can make full use of
>         the PostGIS toolset to do whatever they need (ie. breakdown a
>         polygon to it's constituent linework and rebuild it back up
>         again to a valid polygon)
> 
> 
>             Ok, being even more specific. I'm working with data about
>             Spain. I have a HUGE multipolygon that represents "water"
>             (this is, the coasts around Spain and its islands). Then,
>             the "holes" inside this multypolygon have the shape of
>             Spain, Balearic Islands and Canary Islands. Of course, I
>             have more multipolygons that represent "forests",
>             "airports", "cities", etc, that fit into these holes.
> 
>             Really, my problem is with some points that belong to an
>             airport in an island. Using "ST_Within" and "ST_Contains",
>             the result is that these points belong to the multipolygon
>             "airport" and multipolygon "water" at same time. Obviously,
>             the island (and its airport) is surrounded by water, but the
>             airport's points shouldn't be part of the multopolygon
>             "water". And, as I said, when I apply "ST_isvalid" to the
>             multipolygon "water", returns false. Maybe is not closed?
> 
> 
>         Yeah, as mentioned before, a point that is not on the surface of
>         a (multi)polygon (whether in a hole or completely outside) is
>         not considered within the (multi)polygon.  It does sound like
>         your ocean polygon has validity issues.
> 
> 
> 
>             Oh, btw, what's the difference between "ST_Within" and
>             "Within". Does "ST_Within" use index instead of geometry? Am
>             I right?
> 
>         No, not instead of.  Both use the actual geometry for testing
>         within.  ST_Within will also use the index to narrow down the
>         candidate list first.
>         As you can see, the definition of ST_Within is just a simple SQL
>         wrapper that first invokes the index.
>         postgis=# select prosrc from pg_proc where proname = 'st_within';
>                       prosrc              
>          ---------------------------------------
>         SELECT $1 && $2 AND _ST_Within($1,$2)
>         (1 row)
> 
>         http://postgis.refractions.net/documentation/manual-svn/ST_Within.html
> 
>         Cheers,
>         Kevin
> 
> 
>         _______________________________________________
>         postgis-users mailing list
>         postgis-users at postgis.refractions.net
>         <mailto:postgis-users at postgis.refractions.net>
>         http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list