[postgis-users] Efficient within / intersection of large polygons

Ralf Suhr Ralf.Suhr at itc-halle.de
Fri Jul 30 07:53:16 PDT 2010


if you use the vmap0 dataset you can start with the following

SELECT CASE WHEN abs(carea - ST_Area(Input_Poly)) < 10 THEN
	'is in one country'
	ELSE
	'crosses countrys'
	END AS hint,
	ST_Area(Input_Poly) AS darea,  carea, country_code
FROM
  (
	Select sum(ST_Area(ST_Intersection(Input_Poly, country_geom))) AS carea, 		
		country_code
	FROM country_table
	GROUP BY country_code
  ) AS sub;


Am Freitag 30 Juli 2010, 16:15:32 schrieb Christopher Mutel:
> Ralf-
> 
> I think what you are suggesting is a version of option 4: Storing some
> pre-calculated structure which is already present. Currently the
> database has no concept of country/region/etc. This is certainly a
> possibility, especially as there are a few relatively standardized
> levels, i.e. Continent, region, country, sub-country. It would involve
> several trips between the application code and the database, or
> hard-coding some row structure into a pl/pgsql function.
> 
> I wonder if there are other suggestions or comments?
> 
> On 30 July 2010 13:59, Ralf Suhr <Ralf.Suhr at itc-halle.de> wrote:
> > Hi Chris,
> > 
> > its look like you query all at ones cantons and stats. That does not make
> > sense. You can query one polygon with all districts from the world and
> > secound group by a attribut to get the country(s).
> > 
> > If you want to know if a polygon is in a country you can build sum of
> > area intersections from districts. The result have to be equal to the
> > polygon aerea.
> > 
> > Gr
> > Ralf
> > 
> > Am Freitag 30 Juli 2010, 13:41:55 schrieb Christopher Mutel:
> >> Hello all:
> >> 
> >> I am humbly asking for help in dealing with slow st_within &
> >> st_intersection queries with large (multi)polygons.
> >> 
> >> In my use case, I need to allow users to enter geographic areas (mosty
> >> polygons) through a web interface, and then determine if these areas
> >> lie completely with various continent or regional shapes (e.g. Europe,
> >> South Asia). The database also includes all countries in the world, as
> >> well as stats/oblasts/cantons/etc. for large countries. As the
> >> precision in these areas is relatively high (Europe is a multipolygon
> >> defined by ~1e6 points), this takes quite a long time.
> >> 
> >> The Postgis manual says: "Just because you *can* store all of Europe
> >> in one polygon doesn't mean you *should*." However, in my case I do
> >> need to store the entire polygon, as all polygons defined in the
> >> system will be regularly exported.
> >> 
> >> In my case, simplifying polygons is not really an option, as user
> >> inputs can be points (e.g. factories), and it is quite important to
> >> know whether they are on the Rhein in Germany or in France, for
> >> example. This affects the electricity mix they use, the markets they
> >> buy their inputs from, etc.
> >> 
> >> Looking through the mailing list & the web, it seems like there are
> >> several ways to improve database performance:
> >> 
> >> 1) Performing queries initially against a simplified set of polygons.
> >> However, I am not sure how this is faster than using the bounding box
> >> that is already indexed, and queries would still be made against the
> >> most complex shapes, as they are quite large.
> >> 
> >> 2) "Denormalizing" data, i.e. splitting polygons up into each separate
> >> UTM zone, and then running queries against input polygons which are
> >> similarly separated. This is recommended by the user manual, but I am
> >> not sure if it makes sense in my case. Is there a working code sample
> >> somewhere that does these calculations automatically? Is the easiest
> >> way simply to create polygons that define each UTM zone, and then do
> >> an intersection query for each one?
> >> 
> >> 3) Adding extra indices. I am not sure what else could be indexed, but
> >> maybe there is something that I am not seeing. The only queries I need
> >> are within and intersects/intersection.
> >> 
> >> 4) Caching results. As the base data set does not change often, and
> >> the number of total shapes is relatively small (<1000 even after user
> >> input), calculations could be done once, slowly, and then referenced,
> >> with the appropriate triggers to flush the cache upon updates or
> >> deletes.
> >> 
> >> Any ideas, comments, or questions would be greatly appreciated.
> >> 
> >> Yours,
> > 
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list