[postgis-users] Problem with ST_Within, Polygons and Multipolygons
pcreso at pcreso.com
pcreso at pcreso.com
Thu May 7 15:39:36 PDT 2009
HIi
I have tried a few ways of dealing with this, including ST_Simplify_PreserveTopology() to try to fix errors.
I have yet to find a solution that automatically resolves such probles as I think they should be. Using some variation of buffer(buffer(geom,-1),1) instead of buffer(geom,0) sometimes fixes things that the simple buffer of zero has failed to.
What I have done that works for me is to use QGIS to edit the geometries that are not fixed by the above operations.
I run isvalid() limit 1, then zoom into the returned coodinates & manually fix the problems, which so far have always been pretty simple & obvious. I then rerun isvalid() & work through each error, It doesn't take long per error, 200 or so can be fixed in a morning without too much trouble.
Also note that an empty geometry is regarded as an error by some applications, but passes isvalid(), so I also do a check for ST_NumGeomteries()=0 to clean up PostGIS datasets.
HTH,
Brent Wood
--- On Fri, 5/8/09, Jorge Arévalo <jorge.arevalo at gmail.com> wrote:
> From: Jorge Arévalo <jorge.arevalo at gmail.com>
> Subject: Re: [postgis-users] Problem with ST_Within, Polygons and Multipolygons
> To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
> Date: Friday, May 8, 2009, 9:44 AM
> Hello,
>
> Many thanks for your help Kevin. I can confim one thing:
> isvalid only returns the first error. Just today, I realized
> on the big amount of errors of my data, as you said. Indeed,
> I was reading the same link that you provided me about
> geometry validity, and I split my multipolygon into single
> polygons to check them. My conclusions are the same as
> yours: what a mess!
>
>
>
> Oh, and thanks again for your response to my original
> question. It helped me a lot... Maybe is it time to ask for
> new data, without errors? If not, how should I
> "clean" my data? I tried with ST_Buffer(geom, 0.0)
> (I read in this link http://www.bostongis.com/postgis_extent_expand_buffer_distance.snippet
> that is useful to repair data with self-interesections), but
> doesn't seem to work for me...
>
>
>
> Regards
> Jorge
>
>
>
> On Thu, May 7, 2009 at 6:58 PM,
> Kevin Neufeld <kneufeld at refractions.net>
> wrote:
>
>
> Hi Jorge,
>
>
>
> So, I had a quick look at your data ... I'm sorry to
> say that you've got a lot of problems with your ocean
> polygon.
>
> Paul can correct me since he did the work on isvalidreason,
> but I think st_isvalid only returns the first thing it finds
> wrong with your input geometry.
>
>
>
> I did this and found all kinds of things:
>
>
>
> -- Extract the POLYGONs from the MULTI
>
> CREATE TABLE water_polys AS
>
> SELECT (ST_Dump(the_geom)).geom AS the_geom
>
> FROM water;
>
>
>
> SELECT ST_IsValidReason(the_geom)
>
> FROM water_polys
>
> WHERE NOT ST_IsValid(the_geom);
>
>
>
> So, looking at the individual polygons (of which only the
> first error is again reported), I found over 66
> self-intersections (figure 8's and so forth) and several
> whose boundaries loop back and touch itself.
>
>
>
> Also, in response to your original question about why a
> point on your airport also intersects the ocean polygon: it
> turns out that the island in question is not a hole in your
> ocean at all ... but a second polygon on top of your ocean
> polygon. This is why your point query returns true - there
> are two surfaces (the island and the water) that intersect
> the point (which, by the way, also makes an invalid
> multipolygon).
>
>
>
>
>
> A while back I put together some isvalid/issimple geometry
> descriptions with soem pretty pictures:
>
> http://postgis.refractions.net/documentation/manual-svn/ch04.html#OGC_Validity
>
>
>
> You may want to have a look through it and clean up your
> dataset. The invalidity will most certainly result in more
> unexpected behaviour.
>
>
>
> Cheers,
>
> Kevin
>
>
>
>
>
> Jorge Arévalo wrote:
>
>
> Hello,
>
>
>
> I have additional info. When running the method
> "isvalid", I get this warning message:
>
>
>
> NOTICE: Self-intersection at or near point -2749.99
> 4.7955e+06
>
>
>
> What is exactly a "self-intersection"? May this
> "self-intersection" be the cause of my problem?
>
>
>
> Thanks
>
> Jorge
>
>
>
>
>
>
>
> 2009/5/6 Jorge Arévalo <jorge.arevalo at gmail.com
> <mailto:jorge.arevalo at gmail.com>>
>
>
>
>
>
> Hi,
>
>
>
> Thanks for your response Kevin. I can provide some
> data:
>
>
>
> - Screenshot (over geoserver) of the
> "water" multipolygon. You can
>
> see the shape of Spain and the Balearic Islands, and
> a hole on the
>
> left (Portugal). The multipolygon is open, isn't
> it? -->
>
> http://www.nebulared.com/tmp_geo/water_multipolygon.jpg
>
>
>
> - Zoom over Menorca (one of the Balearic Islands). As
> you can see,
>
> the point tested is inside the "airports"
> multipolygon, but
>
> ST_Within and ST_Contains returns that the point is
> inside both,
>
> "water" and "airports"
> multipolygons -->
>
> http://www.nebulared.com/tmp_geo/zoom_in_water_multipolygon.jpg
>
>
>
> - The "ST_AsText" data for the
> "airport" multipolygon -->
>
> http://www.nebulared.com/tmp_geo/airports_multipolygon.txt
>
>
>
> - The "ST_AsText" data for the
> "water" multipolygon (rar file, 2.5
>
> MB of plain text...) -->
>
> http://www.nebulared.com/tmp_geo/water_multipolygon.rar
>
>
>
> So, how can I check the validity of my multipolygons?
> Do someone
>
> could help me with this?
>
>
>
> Now, the query. My context is the next:
>
> - I have a table with polygons that I want to check
> (to say if they
>
> are in an airport, in a forest, in water, etc). We
> can call this
>
> table T1. I use the centroid of the polygons to
> check, not the
>
> entire polygon.
>
> - In other different table, that we'll call T2, I
> have all the
>
> multipolygons ("water",
> "airports", etc). The name of the field that
>
> says if the multipolygon is "water" or
> "airport" or anything is
>
> "clutter"
>
>
>
> With this code, I pretend to loop over T1, checking
> if the centroid
>
> of its polygons are inside of one (and ONLY one) of
> the
>
> multipolygons of T2. (Is part of a procedure)
>
>
>
> (...)
>
> BEGIN
>
> FOR s IN SELECT * FROM T1 LOOP
>
> select clutter from T2 where
>
> ST_Within(ST_Centroid(T1.wkb_geometry),
> T2.wkb_geometry))
>
> (...)
>
> END LOOP;
>
> RETURN;
>
> END;
>
>
>
> Most of the times, the select inside the loop,
> returns more than one
>
> result ("water" and other one, normally).
> That's my problem
>
>
>
> I have another version of the query, with Java.
> Again, I loop over
>
> the table T1, and check the centroid of its polygons
> with all the
>
> multipolygons (water, airports...) that are in T2.
> Then, my query,
>
> inside the loop, is:
>
>
>
> SELECT clutter FROM T2 WHERE ST_Within(?,
> T2.wkb_geometry)
>
>
>
> The '?' is replaced with the centroid of the
> polygon in T1, of
>
> course. Basically, is the same idea of the previous
> SQL code, but
>
> with Java.
>
>
>
> Apart from this, other problem is that ST_Within
> takes a long time.
>
> 1 sec per query. And I have like 2 millions of
> polygons (queries) to
>
> check...
>
>
>
> My multipolygons' table (T2) has only 29 rows
> (small table with
>
> large geometries), and I found this
>
> http://postgis.refractions.net/documentation/manual-1.3/ch05.html.
>
> Seems to be good for me. But using "SET
> enable_seqscan TO off", the
>
> queries take the same time (~ 1 sec). Maybe the other
> approach
>
> (create an additional column that "caches"
> the bbox, and matching
>
> against this) could help me. Any ideas on how to
> create this bbox
>
> with my huge multipolygons?
>
>
>
>
>
> Thanks in advance
>
>
>
> Regards
>
> Jorge
>
>
>
>
>
> On Wed, May 6, 2009 at 4:28 AM, Kevin Neufeld
>
> <kneufeld at refractions.net
> <mailto:kneufeld at refractions.net>>
> wrote:
>
>
>
> Jorge Arévalo wrote:
>
>
>
> ... If is useful, I tested the method
> "ST_isvalid" with the
>
> multipolygon and returns
> "false". Maybe the multipolygon is
>
> not closed? I loaded the data from a
> shapefile. Is it
>
> possible to create a
> "non-valid" multipolygon? Does PostGIS
>
> accept this?
>
>
>
>
>
> Ah, yes. Most spatial predicates in PostGIS
> assume the input
>
> geometry is valid. Suppose you had defined a
> POLYGON with a
>
> hole or inner ring outside of the exterior
> ring. What is the
>
> area? Does the question even make sense?
> PostGIS allows
>
> invalid geometries in the database so users can
> make full use of
>
> the PostGIS toolset to do whatever they need
> (ie. breakdown a
>
> polygon to it's constituent linework and
> rebuild it back up
>
> again to a valid polygon)
>
>
>
>
>
> Ok, being even more specific. I'm
> working with data about
>
> Spain. I have a HUGE multipolygon that
> represents "water"
>
> (this is, the coasts around Spain and its
> islands). Then,
>
> the "holes" inside this
> multypolygon have the shape of
>
> Spain, Balearic Islands and Canary
> Islands. Of course, I
>
> have more multipolygons that represent
> "forests",
>
> "airports", "cities",
> etc, that fit into these holes.
>
>
>
> Really, my problem is with some points
> that belong to an
>
> airport in an island. Using
> "ST_Within" and "ST_Contains",
>
> the result is that these points belong to
> the multipolygon
>
> "airport" and multipolygon
> "water" at same time. Obviously,
>
> the island (and its airport) is
> surrounded by water, but the
>
> airport's points shouldn't be
> part of the multopolygon
>
> "water". And, as I said, when I
> apply "ST_isvalid" to the
>
> multipolygon "water", returns
> false. Maybe is not closed?
>
>
>
>
>
> Yeah, as mentioned before, a point that is not
> on the surface of
>
> a (multi)polygon (whether in a hole or
> completely outside) is
>
> not considered within the (multi)polygon. It
> does sound like
>
> your ocean polygon has validity issues.
>
>
>
>
>
>
>
> Oh, btw, what's the difference
> between "ST_Within" and
>
> "Within". Does
> "ST_Within" use index instead of geometry? Am
>
> I right?
>
>
>
> No, not instead of. Both use the actual
> geometry for testing
>
> within. ST_Within will also use the index to
> narrow down the
>
> candidate list first.
>
> As you can see, the definition of ST_Within is
> just a simple SQL
>
> wrapper that first invokes the index.
>
> postgis=# select prosrc from pg_proc where
> proname = 'st_within';
>
> prosrc
> ---------------------------------------
>
> SELECT $1 && $2 AND _ST_Within($1,$2)
>
> (1 row)
>
>
>
> http://postgis.refractions.net/documentation/manual-svn/ST_Within.html
>
>
>
> Cheers,
>
> Kevin
>
>
>
>
>
>
> _______________________________________________
>
> postgis-users mailing list
>
> postgis-users at postgis.refractions.net
>
> <mailto:postgis-users at postgis.refractions.net>
>
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
>
>
>
>
>
>
>
> ------------------------------------------------------------------------
>
>
>
> _______________________________________________
>
> postgis-users mailing list
>
> postgis-users at postgis.refractions.net
>
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
>
> postgis-users mailing list
>
> postgis-users at postgis.refractions.net
>
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
>
>
> -----Inline Attachment Follows-----
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
More information about the postgis-users
mailing list