[postgis-users] Geoprocessing & BigData

Stephen Woodbridge woodbri at swoodbridge.com
Tue Jan 19 05:40:45 PST 2016


Also, if your flood zones are large (as in many points defining the
boundaries), you can greatly speed up the processing by first chopping
them into tiles using a square grid and then doing your comparison.

This works because each tile is a square or part of a square and has a
greatly reduced number of points in its boundary, which speeds up the
calculations. Many of the tiles will have only 5 nodes, for the complete
square. This also reduces the number of blocks that need to be compared
to the flood zone, because each tile has a smaller bounding box.
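
To make the tiling idea concrete, here is a minimal sketch in plain Python
rather than PostGIS. It is only an illustration under stated assumptions:
the function names (clip_to_box, area, tile_polygon) are made up for this
example, the clipping uses the Sutherland-Hodgman algorithm as a stand-in
for whatever intersection routine you would run in the database, and the
"flood zone" is a synthetic circle with 1000 vertices. Sutherland-Hodgman
can leave degenerate edges on concave polygons, so treat this as a
demonstration of why tiles end up with few vertices, not as production code.

```python
import math


def clip_to_box(polygon, xmin, ymin, xmax, ymax):
    """Clip a polygon to an axis-aligned box (Sutherland-Hodgman).

    `polygon` is a list of (x, y) vertices without a repeated closing vertex.
    """

    def clip(points, inside, intersect):
        out = []
        for i, cur in enumerate(points):
            prev = points[i - 1]  # wraps to the last vertex when i == 0
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        return out

    # Intersection of segment p-q with a vertical or horizontal line.
    # Only called when p and q lie on opposite sides, so no division by zero.
    def at_x(x):
        return lambda p, q: (x, p[1] + (q[1] - p[1]) * (x - p[0]) / (q[0] - p[0]))

    def at_y(y):
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y - p[1]) / (q[1] - p[1]), y)

    points = list(polygon)
    for inside, intersect in (
        (lambda p: p[0] >= xmin, at_x(xmin)),
        (lambda p: p[0] <= xmax, at_x(xmax)),
        (lambda p: p[1] >= ymin, at_y(ymin)),
        (lambda p: p[1] <= ymax, at_y(ymax)),
    ):
        points = clip(points, inside, intersect)
    return points


def area(points):
    """Unsigned polygon area via the shoelace formula."""
    return abs(sum(points[i - 1][0] * p[1] - p[0] * points[i - 1][1]
                   for i, p in enumerate(points))) / 2.0


def tile_polygon(polygon, cell):
    """Cut a polygon into pieces along a square grid with spacing `cell`."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    tiles = []
    for ix in range(math.floor(min(xs) / cell), math.ceil(max(xs) / cell)):
        for iy in range(math.floor(min(ys) / cell), math.ceil(max(ys) / cell)):
            piece = clip_to_box(polygon, ix * cell, iy * cell,
                                (ix + 1) * cell, (iy + 1) * cell)
            if len(piece) >= 3 and area(piece) > 0:
                tiles.append(piece)
    return tiles


# Synthetic stand-in for a detailed flood zone: a 1000-vertex circle.
zone = [(10 * math.cos(2 * math.pi * k / 1000),
         10 * math.sin(2 * math.pi * k / 1000)) for k in range(1000)]
tiles = tile_polygon(zone, cell=5)
print(len(zone), "vertices in the original zone")
print(len(tiles), "tiles, largest has", max(len(t) for t in tiles), "vertices")
```

In a run like this, tiles that lie fully inside the zone should reduce to
the bare square, and boundary tiles carry only their share of the original
vertices. Each tile also has a much smaller bounding box than the whole
zone, so an index lookup returns far fewer candidate blocks per tile.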

-Steve

On 1/19/2016 3:53 AM, Felix Kunde wrote:
> Hey Ravi
> Hm, your query doesn't sound like you need any parallelization method in
> the first place. You could use ST_Contains/ST_Within to get the census
> blocks fully covered by flood zones + ST_Intersects for blocks partly
> covered with centerpoint inside the flood zone. Should work well with
> your index. Sum up the area of covered blocks and divide by the total
> area of blocks. No need for intersection and union at all...
>
> Btw: For ArcGIS-like union I found this workaround
> https://gist.github.com/mapbutcher/9358937
> Good luck
> Felix
> *Sent:* Monday, 18 January 2016 at 23:46
> *From:* "Ravi Pavuluri" <ravitheja at ymail.com>
> *To:* vincent.ml at oslandia.com, "PostGIS Users Discussion"
> <postgis-users at lists.osgeo.org>, "Rémi Cura" <remi.cura at gmail.com>
> *Subject:* Re: [postgis-users] Geoprocessing & BigData
> Vincent and Remi,
>
> Thank you both for your inputs. I have combined two things in one
> thread. Parallelization is a secondary need and I will look into
> "Postgresql-Xc, Greenplum or custom code approach".
>
> Regarding the PostGIS performance on intersecting geometries, I am not
> able to see any improvement. I am looking at intersection because of my
> use case. (Ex: What % of census blocks fall in Zone A, Zone B, Zone C
> etc. flood zones from Flood Zones Layer). If intersect is to be avoided,
> can this be achieved through another way?
>
>
> @Vincent : For ArcGIS Union, please see here.
> http://resources.esri.com/help/9.3/arcgisengine/java/gp_toolref/analysis_tools/union_analysis_.htm
>
>
> Any inputs are appreciated.
>
> Thanks again,
> Ravi.
>
> --------------------------------------------
> On Mon, 1/18/16, Rémi Cura <remi.cura at gmail.com> wrote:
>
> Subject: Re: [postgis-users] Geoprocessing & BigData
> To: vincent.ml at oslandia.com, "PostGIS Users Discussion"
> <postgis-users at lists.osgeo.org>
> Cc: "Ravi Pavuluri" <ravitheja at ymail.com>
> Date: Monday, January 18, 2016, 2:51 PM
>
> Hey,
> if you have one beefy server, you can parallelize by running several
> queries, each working on a subset of your data (aka parallel processing
> through data partitioning).
> One conceptual example: you want to process the world, so you create 20
> workers and a list of countries, and then make the workers process the
> list country by country.
>
> If you think one postgres server will not be sufficient, you could of
> course shard your data across several servers, with options ranging from
> writing it from scratch (you rewrite everything), to using existing open
> source code, to dedicated solutions like Postgresql-Xc, Greenplum, ...
>
> However, sorry to say this, but in your case it looks like your first
> improvement step will not come from massive parallelization but from
> first better understanding the world of geospatial data and PostGIS.
>
> Cheers,
> Rémi-C
>
> 2016-01-18 19:30 GMT+01:00 Vincent Picavet (ml) <vincent.ml at oslandia.com>:
> Hi Ravi,
>
> On 18/01/2016 19:14, Ravi Pavuluri wrote:
> > Hi All,
> >
> > I am checking if there is a way to quickly process large datasets such
> > as census blocks in PostGIS, and also by leveraging a big data platform.
> > I have a few questions in this regard.
> >
> > 1) When I try intersect for sample census blocks with another polygon
> > layer, PostGIS 2.2 (on Postgres 9.4) takes ~60 minutes (after optimizing
> > from http://postgis.net/2014/03/14/tip_intersection_faster/ ) while
> > ESRI ArcMap takes ~10 minutes. PostGIS layers already have geospatial
> > indices. Is there any way to optimize this further?
>
> Following the links on your page, here is a good answer from Paul
> (TL;DR: st_intersection is slow, avoid it):
> http://gis.stackexchange.com/questions/31310/acquiring-arcgis-like-speed-in-postgis/31562
>
> > 2) What is an equivalent of ESRI Union in PostGIS? I didn't see any out
> > of the box functions and any tips here are appreciated.
>
> If ESRI Union makes a union, maybe st_union? But I guess there are some
> semantic issues here.
>
> > 3) Is there any way we can expedite these geoprocessing tasks
> > (union/intersect etc.) using a big data platform (e.g. Hadoop)? Most
> > examples talk about analysis (contains etc.) but not about geoprocessing
> > on geospatial data. Any input is appreciated.
>
> Lots of people do geoprocessing with PostGIS too, including long-running
> jobs on large volumes of data (notably worldwide OSM data processing).
> "Big data" is a really subjective term. Are your geoprocessing needs
> really parallelizable? What kind of volumes are we talking about: MB,
> GB, TB? What kind of hardware do you have at hand?
>
> One way to do some sort of map-reduce with PostGIS is to use a bunch of
> servers with FDW connections between a source master and these slaves,
> map the data processing to the slave servers, and reduce it on the main
> server. With a bit of Python as glue code this can be automated and
> quite efficient, even though this kind of sharding is not automated
> (yet?).
>
> Vincent
>
> > Thanks,
> > Ravi.
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/postgis-users

