[postgis-users] [postgis-devel] Sharding Rasters in Postgis

Paul Ramsey pramsey at cleverelephant.ca
Wed Sep 24 10:38:20 PDT 2014


Basically, if your queries don’t need to find tiles that neighbor each other, you can distribute them “randomly” around the cluster, which means any given roll-up query can heat up every node at once, for maximum use of the cluster. That’s the way you want it to work. If you’re going to have multiple layers of rasters and do interactions between them (basic map algebra, etc.), you can still get good efficiency just by ensuring that they are tiled on exactly the same basis, so any given tile will line up with all its friends from other layers. If, however, you’re going to need to run operations that need to find neighboring tiles (cost surfaces, slope calculations), then having the tiles randomly spread around won’t work so well.
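
As a rough illustration of tiling "on exactly the same basis": if every layer derives its tile key from one shared grid origin and tile size, and the node is chosen by hashing only that key, then tiles covering the same ground land on the same node whatever layer they belong to. This is a minimal sketch; NUM_NODES, TILE_SIZE, ORIGIN and both function names are assumptions made up for the example, not anything from PostGIS.

```python
import hashlib
import math

NUM_NODES = 4          # hypothetical cluster size
TILE_SIZE = 100.0      # hypothetical tile width/height in map units
ORIGIN = (0.0, 0.0)    # hypothetical grid origin shared by every layer


def tile_key(x, y):
    """Return the (column, row) of the grid tile containing point (x, y)."""
    col = math.floor((x - ORIGIN[0]) / TILE_SIZE)
    row = math.floor((y - ORIGIN[1]) / TILE_SIZE)
    return col, row


def node_for(key):
    """Map a tile key to a node with a stable hash; the layer name is not hashed."""
    digest = hashlib.md5(f"{key[0]}:{key[1]}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


# Two layers tiled on the same grid: points inside the same tile give the
# same key, so the elevation tile and the land-cover tile share a node and
# can be combined without moving data between nodes.
elevation_tile = tile_key(250.0, 730.0)
landcover_tile = tile_key(299.9, 701.0)
same_node = node_for(elevation_tile) == node_for(landcover_tile)
```

Because only the key is hashed, adjacent keys such as (2, 7) and (3, 7) have no particular relationship to each other, which is why neighborhood operations are a poor fit for this layout.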

For an OLAP server, being able to shard so that every node gets used during a query is optimal, particularly since, if you shard based on spatial location, you’ll naturally end up with bottlenecks on hot nodes. (Like, you will tend to roll up on spatial areas, which means only the node that serves that area will end up hot during the query, and the rest of the nodes will be idle: the absolute worst utilization situation.)
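
The hot-node effect can be seen in a toy model. The sketch below assumes a 16 x 16 tile grid on a four-node cluster (both numbers invented for the example) and counts how many nodes hold tiles for a roll-up over one region, first with location-based sharding and then with hash-based sharding.

```python
import hashlib

NUM_NODES = 4      # hypothetical cluster size
GRID_COLS = 16     # hypothetical grid of 16 x 16 tiles
GRID_ROWS = 16


def spatial_node(col, row):
    """Shard by location: each node owns a contiguous strip of columns."""
    return col * NUM_NODES // GRID_COLS


def hashed_node(col, row):
    """Shard by a stable hash of the tile id, ignoring location."""
    digest = hashlib.md5(f"{col}:{row}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


def nodes_touched(assign, cols, rows):
    """Return the set of nodes holding at least one tile in the query window."""
    return {assign(c, r) for c in cols for r in rows}


# A roll-up over one region: the westernmost quarter of the grid (64 tiles).
window_cols, window_rows = range(0, 4), range(0, GRID_ROWS)
spatial_nodes = nodes_touched(spatial_node, window_cols, window_rows)
hashed_nodes = nodes_touched(hashed_node, window_cols, window_rows)

print("spatial sharding uses nodes:", sorted(spatial_nodes))
print("hash sharding uses nodes:   ", sorted(hashed_nodes))
```

With the location-based assignment the whole window sits on a single node, while the hashed assignment spreads the same 64 tiles over the cluster.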

P

-- 
Paul Ramsey
http://cleverelephant.ca
http://postgis.net

On September 24, 2014 at 6:26:26 AM, David Haynes II (dahaynes at umn.edu) wrote:

Not sure if I understand exactly what you mean by "spatial correlation".
All of the queries will be spatially defined, using predicates like ST_Intersects. The strategy that I am thinking of employing basically divides the world into N zones, similar to UTM projection grids. Each zone will have a defined spatial extent and will use "inheritance" / a hashkey to identify that extent. I am very new to this area, as it is outside my expertise. If there is some literature you can point me to that might explain it better, I would appreciate that. It would allow me to present the problem in a better way.
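
A zone scheme of this kind can be sketched in a few lines. The example assumes plain 6-degree longitude strips, as in the UTM grid, and ignores UTM's latitude bands and special-case zones; the function names are hypothetical and this is only one way the idea might be expressed.

```python
ZONE_WIDTH_DEG = 6.0   # UTM-style longitude strips
ZONE_COUNT = 60        # 360 degrees / 6 degrees per zone


def zone_for_longitude(lon):
    """Return the 1-based zone number for a longitude in [-180, 180]."""
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    zone = int((lon + 180.0) // ZONE_WIDTH_DEG) + 1
    return min(zone, ZONE_COUNT)  # lon == 180 folds into the last zone


def zones_for_extent(min_lon, max_lon):
    """Return every zone a west-to-east extent overlaps (no antimeridian wrap)."""
    return list(range(zone_for_longitude(min_lon),
                      zone_for_longitude(max_lon) + 1))


minneapolis_zone = zone_for_longitude(-93.27)
minnesota_zones = zones_for_extent(-97.2, -89.5)
```

Under this scheme the zone number serves as the partition or shard key, so a query over a small extent only ever involves the few partitions whose zones it overlaps.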

https://wiki.postgresql.org/wiki/HashTable

On Tue, Sep 23, 2014 at 4:14 PM, Paul Ramsey <pramsey at cleverelephant.ca> wrote:
If you’re just doing queries that don’t take advantage of spatial correlation at all, using any hashkey on the contents as the sharding key should work just great. Then you can use pl/proxy to run roll-ups against spatial polygons, etc, easily and know you won’t end up with hot nodes.
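
To make the roll-up pattern concrete, here is a small scatter/gather sketch. Each "node" is just a Python list, the spatial filter is replaced by a set of wanted tile ids, and all names are invented for the example. In a real deployment the per-node step would be a SQL function on each shard and the coordinator role is the part a tool like PL/Proxy would play.

```python
import hashlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size


def shard_for(tile_id):
    """Pick a node from a stable hash of the tile identifier."""
    digest = hashlib.sha1(tile_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


def distribute(tiles):
    """Group (tile_id, pixel_values) pairs by the node that would store them."""
    nodes = defaultdict(list)
    for tile_id, values in tiles:
        nodes[shard_for(tile_id)].append((tile_id, values))
    return nodes


def partial_rollup(node_tiles, wanted_ids):
    """Run on one node: return (sum, count) over the tiles the query selects."""
    total, count = 0.0, 0
    for tile_id, values in node_tiles:
        if tile_id in wanted_ids:
            total += sum(values)
            count += len(values)
    return total, count


def rollup_mean(nodes, wanted_ids):
    """Coordinator: merge the per-node partial results into one mean."""
    partials = [partial_rollup(t, wanted_ids) for t in nodes.values()]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count if count else None


tiles = [("t1", [1, 2, 3]), ("t2", [4, 5]), ("t3", [6]), ("t4", [7, 8, 9])]
nodes = distribute(tiles)
mean = rollup_mean(nodes, {"t1", "t2", "t4"})
```

Returning (sum, count) pairs rather than per-node means keeps the merged result exact no matter how unevenly the selected tiles fall across nodes.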

P


On September 23, 2014 at 2:11:18 PM, David Haynes II (dahaynes at umn.edu) wrote:

Yes, we are going to use the more finished product, Postgres-XL, but even that is not production ready.
So I would think the most logical solution is to build the process ourselves.

I am mostly wondering how to distribute the tables to various locations once I have done the difficult part of aligning the tiles to a defined coordinate space.

On Mon, Sep 22, 2014 at 11:12 AM, Rémi Cura <remi.cura at gmail.com> wrote:
Do you mean something like Postgres-XC? http://postgresxc.wikia.com/wiki/Postgres-XC_Wiki

Cheers,
Rémi-C

2014-09-22 17:04 GMT+02:00 David Haynes II <dahaynes at umn.edu>:
Hello,

I was wondering if there are any helpful examples for distributing raster tables in PostGIS. Are there other items that I need to consider? Will the PostGIS functions need to be rewritten if we want to parallelize the processing, using something like pg_proxy?

I came across an example using a different software package (slide 16):
http://spark-summit.org/wp-content/uploads/2014/07/Geotrellis-Adding-Geospatial-Capabilities-to-Spark-Ameet-Kini-Rob-Emanuele.pdf
I would prefer to stay within the PostgreSQL community.

--
David Haynes, Ph.D.
Research Associate Terra Populous
Minnesota Population Center

_______________________________________________
postgis-devel mailing list
postgis-devel at lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel

