[postgis-users] Partitioning using geometry

Rémi Cura remi.cura at gmail.com
Sat Apr 4 03:52:53 PDT 2015


Hey,
I am exploring the possibility of storing thousands of billions of points
in Postgres (tens of billions without partitioning, thousands of billions
with partitioning).
MongoDB is actively being researched for this ([1] and [2]), with exciting
prospects and several drawbacks as well.

I think you may be mixing things up.
When the data is important (and will _not_ be discarded), you need
guarantees (ACID). Postgres offers that.

Now when you have tons and tons of fast-updating data, don't really care
if a little data gets lost, __and__ need fast read/write, NoSQL is
currently the easiest solution.

It is precisely about getting the right tool for the right job.

I use an ad hoc solution for each of my use cases (billion-point cloud, 100k
topology + graph optimisation, million vectors, 100k km2 raster).


And as a philosophical matter, I strongly disagree with the current trend
toward massive low-level data, and consider that the real challenge is to
understand the data rather than brute-force it (this is the meta-objective
of my PhD).


[1] Martinez-Rubi, Oscar, Martin L. Kersten, Romulo Goncalves, and Milena
Ivanova. “A Column-Store Meets the Point Clouds.” *FOSS4G-Europe Academic
Track*, 2014.
http://europe.foss4g.org/2014/sites/default/files/11-Martinez-Rubi_0.pdf.
[2] Van Oosterom, Peter, Oscar Martinez-Rubi, Milena Ivanova, Mike
Horhammer, Daniel Geringer, Siva Ravada, Theo Tijssen, Martin Kodde, and
Romulo Gonçalves. “Massive Point Cloud Data Management: Design,
Implementation and Execution of a Point Cloud Benchmark.” *Computers &
Graphics*, February 2015. doi:10.1016/j.cag.2015.01.007.



Cheers,
Rémi-C

2015-04-04 2:39 GMT+02:00 Mark Wynter <mark at dimensionaledge.com>:

> Remi, sounds like you're close to finding a pathway that matches your
> needs.
>
> There are always several ways to solve every problem. Yours seems to be
> about storage and fast access to the bits that are relevant. You mention
> circa 1 thousand tiles.
>
> The only reason I'm pursuing this topic is because I'm sensing a broader
> solution to an emerging set of challenges, particularly with the "internet
> of things", where we will collect seriously BIG spatio-temporal data (e.g.
> from sensors) and can't afford to pull everything into PostGIS.
>
> IoT requires solutions that can deal with datasets many orders of
> magnitude larger, but can also scale down to suit small projects.
>
> That means using big data / NoSQL solutions to do some of the simple
> heavy lifting.
>
>
> MongoDB has spatial capabilities, and you can use the Mongo FDW to
> return query results into PG.
>
>
> When I last looked, MongoDB only had 2D spatial indexing. Not ideal, but
> still useful for point cloud storage... I also think the article linked
> below is handy because the author lays out the business problem and how
> he's tackled it.
>
>
> http://rs.tudelft.nl/~rlindenbergh/workshop/BoehmIQmulus.pdf
>
> I'm not advocating that people jump on the NoSQL bandwagon (I love
> PostGIS and prefer to do as much of my work as possible in PostGIS), but
> it's good to know each DB's respective strengths and why.
>
> IMHO, the quicker we become conversant in polyglot DB design, the better.
>
>
> If we think every problem is a nail and therefore PostGIS is the hammer /
> answer, then innovation (which is about using and combining existing
> technologies in new ways) will leave us behind.
>
>
> OK, time to park NoSQL and apply some left-field thinking...
>
> Take one of Paul's recent innovations...
>
> https://github.com/pramsey/pgsql-ogr-fdw
>
> What if we could read straight from a file... or "document" into PG? And
> I'm not talking about every document, just the ones relevant to us, on an
> as-needed basis.
>
> This is what mongo_fdw does. In fact, we can write back too.
>
> FDWs go to the heart of polyglot design.
>
> The common theme for IoT is the need to leverage a cheap distributed file
> / document storage system to enable fast search and retrieval, with a
> basic level of spatial awareness that can inform and feed downstream
> applications, including PostGIS.
>
> The idea is that stuff not needed or of no business value downstream gets
> left behind on the cheapest hardware and software before being discarded.
>
> I'm keen to hear from anyone about how they are using PostGIS in
> combination with NoSQL DBs for their big data pipelines.
>
> Or if there's another mailing list that covers spatial IT /
> spatio-temporal big data processing, I'd love to hear about it.
>
> Thanks
>