<div dir="ltr">> <span style="font-size:12.8000001907349px">Howard: About the Greyhoud+s3, but what storage solution do you use, it is not clear...mongodb? I mean where are the points stored? file-based, dbms? </span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">The storage for <a href="http://iowalidar.com" target="_blank">iowalidar.com</a> is S3 - but could be any key/value storage system (filesystem, database, a back-end web server supporting PUT/GET, etc.).</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">There is a single "base" chunk containing some well-known number (N) of compressed points, which is stored as key "0". </span><span style="font-size:12.8000001907349px">After that, there is a well-known chunk size (C). So the next key after "0" is "N", containing C compressed points, so the subsequent keys are stringified integers following the form "N + C*x" for x >= 0. The key for each chunk is the ID of the first point of that chunk, and the value is the compressed binary point data starting at that ID.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Currently the chunk size is determined by the ending level of the base depth split into quadrants. For example say the base contains depth levels [0, 8), non-inclusive. Then the chunk size will be level 8 of the 'tree' split into 4 chunks, so (4^8)/4 points. Those chunks contain the four quadrants of the bounds of the entire set, at a single level of detail. Continuing on with that chunk size creates a store where each subsequent tre</span><span style="font-size:12.8000001907349px">e level is split into 4 times the number of chunks as the previous level.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">From there, given a point ID, we can easily figure out the ID of its chunk and fetch it from S3 - typically the entire chunk will be used in a query since the client is traversing its own virtual tree by splitting the bounds and climbing upward in tree depth. We are running Greyhound on an EC2 instance so the fetches from S3 are very fast. The client specifies its queries via a bounding box and a depth level (or range of levels), from which we can, in parallel, fetch all chunks selected by this query and start streaming out the points within range. A client could also do a bit more work and fetch directly by chunk ID, but we like the abstraction of using a depth level and bounding box to decouple the physical storage from the client queries.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">- Connor</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 20, 2015 at 9:03 AM, Oscar Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi,<span class=""><br>
<br>
<div>On 20-04-15 13:13, Rémi Cura wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace">Hey Oscar,<br>
<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I'm a really big fan
of Lidar for archeological use, and integrating time into it
is especially trendy and challenging. Registetring all point
cloud together from different sources must have been really
difficult.<br>
</div>
</div>
</blockquote>
<br></span>
That is really tricky indeed! At NLeSC we worked on an automatic
open-source alignment tool
(<a href="https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py" target="_blank">https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py</a>)
which works for some cases when aligning point clouds from
archaeological monuments (from photogrametry) with a lidar dataset.
For other cases we have a manual alignment tool that is a 3D desktop
viewer using based on OpenSceneGraph (where also meshes and pictures
can be displayed).<span class=""><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I contacted Potree
developper one year ago to ask him if it was possible to
modify it to read points in a DBMS (actually patch with LOD).<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">He said it was
possible and not too difficult.<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I don't know how much
point output you get, but we demonstrated around 20kpts/s
streaming to browser (with a lot of
serialization/deserialization). Currently the upper limit for
such output would be in the few hundred kpts/s if you send
points, and in the few Million pts/s if you stream compressed
patches.<br>
</div>
</div>
</blockquote>
<br></span>
Currently we are getting around 200kpoints/sec using LAS format (not
remember how much we got with LAZ) but we also have a no-so-good
server...so I think same solution could give a bit more in other
situations. Anyway if you say compressed patches in DB could deliver
few millions/sec that should be more than enough! And would be nice
to try! <br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">Cheers<br>
</div>
</div>
</blockquote>
Regards,<br>
<br>
O.<div><div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2015-04-20 11:58 GMT+02:00 Oscar
Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Whoa!<br>
<br>
Thanks guys for all the material! I am now busy reading it
all!<br>
<br>
Remi: I had to read your mail a few times ;) Great slides
(I actually looked at all of them, very well done!) Very
interesting topics you are researching!<br>
<br>
Howard: About the Greyhoud+s3, but what storage solution
do you use, it is not clear...mongodb? I mean where are
the points stored? file-based, dbms? <br>
<br>
Paul+Nouri: The geohash tree that Noury mentions is the
ght compression in pgpointclouid, right? I tried it once
but there was some limitation with the type of
coordinates, they had to be long and lat so I guess there
need to be a reference system transformation in between,
right? Any place where I can find an example on how to use
this?<br>
<br>
<br>
At NLeSC for the visualization of data we are using a
system based on the potree visualization (so, file-based)
but I am very very interested on the stuff you are guys
doing and I would love to be convinced that DBMS solutions
can be really efficient for visualization as well (i think
it is close now!). We choose file-based and potree because
of the initial lack of LoD support in DBMS, the speed the
file-based approach and the super compressed LAZ storage.<br>
<br>
To see what we have done so far:<br>
<br>
<a href="https://github.com/NLeSC/PattyVis" target="_blank">https://github.com/NLeSC/PattyVis</a><br>
<a href="https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d" target="_blank">https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d</a><br>
(see the video from 1:40 for the potree based
visualization)<br>
<br>
One of the many reasons I would loved to be convinced that
DBMS is that now we are considering how to visualize the
640B AHN2 dataset, and in a pure file-based solution (like
the potree) I fear that when restructuring the data to
octree we would need a number of octree nodes/files
probably larger than what ext4 can handle!. We will try, I
let you know how that goes ;), but it would be really nice
to have a efficient and fast DBMS-based alternative!<br>
<br>
I am very happy though with all the different work you are
all doing and excited to see how fast things improve and
evolve!! <br>
<br>
Keep on like this guys!<br>
<br>
Regards,<br>
<br>
O.
<div>
<div><br>
<br>
<br>
<div>On 17-04-15 19:01, Sabo, Nouri wrote:<br>
</div>
<blockquote type="cite">
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Thank
you for sharing these ideas. Many of the ideas
can make improvements. In the prototype we
have developed at RNCan and that we mentioned
in the paper in attachment we have implemented
some of these concepts. For example, in the
prototype we are sorting points according to
the Morton pattern before creating blocks. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">And each block is composed only
of points that are spatially close, thereby
improving the level of compression. We also
use the properties of the Morton curve (Z
pattern) to do spatial queries using Geohash
as BBox. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Usually,
in Geohash based system the more the Geohash
prefixes for two points resemble one another,
the more they are spatially close to each
other. Unfortunately, this property is not
always complied with two points located on
either side of a subdivision line. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">For this reason we implemented a
neighbourhood based strategy to allow spatial
query based on the hash string. </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Also to improve the compression
and performance we can change the encoding of
Geohash. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Currently,
the hashes are encoded as base 32 strings,
which causes a lot of overhead (5 bits are
inflated in 8 bits of character). </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Unfortunately, the current libght
does not include all the concepts of
GeoHashTree. </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Oscar, I will read your paper and
get you back so we could continue to exchange.</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Kind
regards!</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Nouri,</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" lang="EN-CA"> </span></p>
<p class="MsoNormal"><span lang="EN-CA"> </span></p>
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US"> Paul Ramsey [<a href="mailto:pramsey@cleverelephant.ca" target="_blank">mailto:pramsey@cleverelephant.ca</a>]
<br>
<b>Sent:</b> 17 avril 2015 06:56<br>
<b>To:</b> <a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a>;
Peter van Oosterom; Oscar Martinez Rubi;
Howard Butler; Rémi Cura<br>
<b>Cc:</b> Sabo, Nouri<br>
<b>Subject:</b> Re: [pgpointcloud] RLE and
SIGBITS heuristics</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi
Oscar, </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">This
sounds like a slightly more sophisticated
version of the work done at Natural
Resources Canada for what they call “geohash
tree”. They did find that they got pretty
good compression (even with the simple
ascii-based key!) using the scheme, and it
did allow easy random access to subsets of
the data.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://2013.foss4g.org/conf/programme/presentations/60/" target="_blank">http://2013.foss4g.org/conf/programme/presentations/60/</a></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">The
downside was of course the cost of sorting
things in the first place, but for a
one-time cost on frequently accessed data,
it’s not a bad thing. The “libght” soft
dependency in pgpointcloud is to a (not so
great) implementation of the scheme that I
did for them a couple years ago. As a
scheme, I think it cuts against the idea of
having small patches that is core to the
pgpointcloud concept. It makes more and more
sense the larger your file is, in that it
gets greater and greater leverage for random
access.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">ATB,</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">P.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">-- <br>
Paul Ramsey<br>
<a href="http://cleverelephant.ca" target="_blank">http://cleverelephant.ca</a></span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://postgis.net" target="_blank">http://postgis.net</a> </span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif";color:black">On
April 17, 2015 at 11:02:47 AM, Oscar Martinez
Rubi (<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>)
wrote:</span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,<br>
<br>
About the XYZ binding for better
compression. In our research in the NL
escience center and TU Delft we have
been thinking (not testing yet though)
about one possible approach for this.<br>
<br>
It is based on using space filling
curves. So, once you have the points
that go in a block you could compute the
morton/hilbert code of the XYZ. Since
all the points are close together such
codes will be extremely similar, so one
could store only the increments which
could fit in many few bits. We have not
tested or compared this with any of the
other compressions but we just wanted to
share it with you just in case you find
it useful!<br>
<br>
An additional improvement would be to
sort the points within the blocks
according to the morton code. Then, when
doing crop/filter operations in the
blocks one can use the morton codes for
the queries similarly to what we
presented in our papers with the flat
table (without blocks), I attach one of
them (see section 5.2). In a nutshell:
You convert the query region into a set
of quadtree/octree nodes which can be
also converted to morton code ranges
(thanks to relation between
morton/hilbert curve and a
quadtree/octree). You scale down the
ranges to increments (like you did when
storing the point of the block) and then
you simply do range queries in sorted
data with a binary algorithm. In this
way you avoid the decompression of the
morton code for most of the block. This
filtering is equivalent to a bbox filter
so it still requires a point in polygon
check for some of the points.<br>
<br>
Kind Regards,<br>
<br>
Oscar.<br>
<br>
</span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On
16-04-15 18:15, Rémi Cura wrote:</span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"><span>epic fail !
I had avoided html just for you</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<div>
<p class="MsoNormal"><span><br>
Dataset |subset size |
compressing | decompressing |<br>
|(Million
pts)|(Million pts/s)|(Million
pts/s)|<br>
Lidar | 473.3 |
4,49 | 4,67 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<p class="MsoNormal"><span>21-atributes
| 105.7 | 1,11 |
2,62 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span>Stereo
| 70 | 2,44
| 7,38 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<div>
<p class="MsoNormal"><span>Cheers</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">2015-04-16
17:42 GMT+02:00 Sandro Santilli
<<a href="mailto:strk@keybit.net" target="_blank">strk@keybit.net</a>>:</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On
Thu, Apr 16, 2015 at 05:30:12PM
+0200, Rémi Cura wrote:<br>
> OUps<br>
><br>
> Dataset | subset
size(Million pts) | compressing
(Million pts/s) |<br>
> decompressing (Million pts/s)<br>
> Lidar |
473.3 |
4,49<br>
> |
__4,67__<br>
> 21 attributes |
105.7 |<br>
> 1,11 |
2,62<br>
> Stereo |
70 |
2,44<br>
> |
7,38<br>
<br>
These tables aren't really
readable here.<br>
Could you make sure to use a
fixed-width font to write those
tables<br>
and to keep lines within 70
columns at most ?<br>
<br>
--strk;</span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><br>
<br>
<br>
</span></p>
_______________________________________________
pgpointcloud mailing list
pgpointcloud@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud