<div dir="ltr"><div><div><div><div><div><div><div>Hey,<br></div>I am exploring the possibility of storing thousands of billions of points in Postgres (tens of billions without partitioning, thousands of billions with partitioning).<br></div>MongoDB is actively researched for this [1] and [2], with exciting prospects and several drawbacks as well.<br><br></div>I think you may be mixing things up. <br></div>When the data is important (and will _not_ be discarded), you need guarantees (ACID). Postgres offers that.<br><br></div>Now, when you have tons and tons of fast-updating data, don't really care if a little data gets lost, __and__ need fast read/write, NoSQL is currently the easiest solution.<br><br></div>It is precisely about getting the right tool for the right job.<br><br></div>I use an ad hoc solution for each of my use cases (billion-point cloud, 100k topology + graph optimisation, million vectors, 100k km2 raster).<br><div><div><div><div><div><div><div><br><br></div><div>And as a philosophical matter, I strongly disagree with the current trend toward massive low-level data, and consider that the real challenge is to understand the data rather than brute-force it (this is the meta-objective of my PhD).<br></div><div><br><br>[1] Martinez-Rubi, Oscar, Martin L. Kersten, Romulo Goncalves, and Milena Ivanova. “A Column-Store Meets the Point Clouds.” <i>FOSS4G-Europe Academic Track</i>, 2014. <a href="http://europe.foss4g.org/2014/sites/default/files/11-Martinez-Rubi_0.pdf">http://europe.foss4g.org/2014/sites/default/files/11-Martinez-Rubi_0.pdf</a>.<div><div><div><div style="line-height:1.35;padding-left:2em" class="">
<div class="">[2] Van Oosterom, Peter, Oscar Martinez-Rubi, Milena Ivanova, Mike Horhammer, Daniel Geringer, Siva Ravada, Theo Tijssen, Martin Kodde, and Romulo Gonçalves. “Massive Point Cloud Data Management: Design, Implementation and Execution of a Point Cloud Benchmark.” <i>Computers & Graphics</i>, February 2015. doi:10.1016/j.cag.2015.01.007.</div>
</div><br><br><br></div><div>Cheers,<br></div><div>Rémi-C<br></div></div></div></div></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-04-04 2:39 GMT+02:00 Mark Wynter <span dir="ltr">&lt;<a href="mailto:mark@dimensionaledge.com" target="_blank">mark@dimensionaledge.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto"><div><span></span></div><div><div><span></span></div><div><div><span></span></div><div>Remi, sounds like you're close to finding a pathway that matches your needs.</div><div><br></div><div>There are always several ways to solve a problem. It seems yours is about storage and fast access to the bits that are relevant. You mention about a thousand tiles.</div><div><br></div><div>The only reason I'm pursuing this topic is that I sense a broader solution to an emerging set of challenges, particularly with the "Internet of Things", where we will collect seriously BIG spatio-temporal data (e.g. from sensors) and can't afford to pull everything into PostGIS.</div><div><br></div><div>IoT requires solutions that can deal with datasets many orders of magnitude larger, but can also scale down to suit small projects.</div><span class=""><div><br></div><div><blockquote type="cite"><font color="#000000"><span style="background-color:rgba(255,255,255,0)">using Big Data / NoSQL solutions to do some of the simple heavy lifting</span></font></blockquote></div><div><br></div></span><div><span class=""><blockquote type="cite"><span>MongoDB has spatial capabilities, and you can use the mongo FDW to return query results into PG.</span><br></blockquote><span></span><div><br></div></span>When I last looked at this, MongoDB only had 2D spatial indexing. Not ideal, but still useful for point cloud storage... 
I also think the attached article is handy because the author lays out the business problem and how he's tackled it.</div><div><blockquote type="cite"><span></span></blockquote><div><br></div><span style="background-color:rgba(255,255,255,0)"><a href="http://rs.tudelft.nl/~rlindenbergh/workshop/BoehmIQmulus.pdf" target="_blank">http://rs.tudelft.nl/~rlindenbergh/workshop/BoehmIQmulus.pdf</a></span><br><br>I'm not advocating that people jump on the NoSQL bandwagon - I love PostGIS and prefer to do as much of my work in PostGIS as I can - but it's good to know each DB's respective strengths and why.<span class=""><br><br><blockquote type="cite"><span>IMHO, the quicker we become conversant in polyglot DB design, the better.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>If we think every problem is a nail and therefore PostGIS is the hammer / answer, then innovation (which is about using and combining existing technologies in new ways) will leave us behind.</span><br></blockquote><div><br></div></span>OK, time to park NoSQL and apply some left-field thinking.</div><div><br></div><div>Take one of Paul's recent innovations...</div><div><br></div><div><span style="background-color:rgba(255,255,255,0)"><a href="https://github.com/pramsey/pgsql-ogr-fdw" target="_blank">https://github.com/pramsey/pgsql-ogr-fdw</a></span></div><div><span style="background-color:rgba(255,255,255,0)"><br></span></div><div><span style="background-color:rgba(255,255,255,0)">What if we could read straight from a file, or "document", into PG? And I'm not talking about every document - just the ones relevant to us, on an as-needed basis.</span></div><div><br></div><div>This is what the mongo_fdw does. In fact, we can write back too.</div><div><br></div><div>FDWs go to the heart of polyglot design.<br><br></div><div>The common theme for IoT is the need to leverage a cheap distributed file / document storage system to enable fast search and retrieval... 
with a basic level of spatial awareness that can inform and feed downstream applications, including PostGIS.</div><div><br></div><div>The idea is that stuff that is not needed, or has no business value downstream, gets left behind on the cheapest h/w and s/w before being discarded.</div></div><div><br></div><div>I'm keen to hear from anyone about how they are using PostGIS in combination with NoSQL DBs for their big data pipelines.</div></div><div><br></div><div>Or if there's another mailing list that covers spatial IT / spatio-temporal big data processing, I'd love to hear about it. </div><div><br></div><div>Thanks</div></div></blockquote></div><br></div>